Abstract


A new viewpoint on electoral involvement is proposed from the study of the statistics of the proportions 
of abstentionists, blank and null, and votes according to list of choices, in a large number of national 
elections in different countries. Considering 11 countries without compulsory voting (Austria, Canada, 
Czech Republic, France, Germany, Italy, Mexico, Poland, Romania, Spain and Switzerland), a stylized 
fact emerges for the most populated cities when one computes the entropy associated to the three ratios, 
which we call the entropy of civic involvement of the electorate. The distribution of this entropy (over all 
elections and countries) appears to be sharply peaked near a common value. This almost common value
has typically been shared since the 1970s by electorates of the most populated municipalities, despite
the wide disparities between voting systems and types of elections. An even more remarkable stability of
this entropy value is observed for the Swiss referendums since the 1880s.

We suggest that this hidden regularity, which we propose to call a 'weak law on
recent electoral behavior among urban voters', reveals an emerging collective behavioral norm
characteristic of urban citizen voting behavior in modern democracies. Analyzing exceptions to the rule provides
insights into the conditions under which this normative behavior can be expected to occur.

Introduction 

Each election yields a variable proportion of citizens not taking part in the vote. The proportion of the
uninvolved population - whether through non-registration, abstention, or a blank or null vote - has been much
less studied than the vote itself.

Nowadays such behaviors are increasing among the longest-established democracies and their meaning
may be changing. Besides passive abstention (due to carelessness or indifference), an active refusal to vote
- possibly bearing a political message - is rising among population categories that usually take
part in elections.

• The modalities of withdrawal [T] 

To measure this phenomenon accurately, we first need to define the non-voter turnout. The boundary
between voters and non-voters is indeed blurred as several intermediate behaviors exist, such as non- 
registering or blank vote. 

The potential voter population depends on the legal requirements of citizenship, residency and
capacity. Registration on the electoral roll does not necessarily imply voting. Moreover, the diversity of
enumeration methods from one country to another makes it difficult to compare ratios of voters
directly. The main trend consists in comparing abstention to the number of citizens entitled to vote (VEP:
Voting Eligible Population). However, in the United States for instance, abstention was calculated until
recently by comparison to the population above the voting age, including foreigners (VAP: Voting Age
Population), the corresponding abstention rate often reaching 50%. Another bias stems from the fact
that some countries made voting compulsory (namely Belgium, Luxembourg, Greece, and for a time the
Netherlands, Austria and Italy). Without compulsory voting, a declining voter turnout has been observed since
the 1980s in established democracies.

Moreover, the meaning of blank and null votes is not obvious. They could be considered at first sight
as equivalent to abstention or non-registration, since they seem to express an absence of choice. This
hypothesis would be in agreement with the systematic reviews of the minutes of polling stations, for
instance.

Abstention has been primarily considered to be a micro-level phenomenon. But is it really? Several 
studies have proven that socio-economic characteristics such as gender [HIS], age [3], education [SJIB] and 
ethnicity [7] have an influence on electoral non-participation. To what extent does living in a community
with a low level of electoral involvement influence a voter?

• The political and institutional context of the election 

The comparative database collected by the Institute for Democracy and Electoral Assistance (IDEA [8]) 
gathered data from elections in 171 countries from 1945 to 1999. It shows that participation rates are 
slightly higher in countries that have adopted a system of proportional representation, offering a larger 
choice to voters than those which have majority or mixed systems. The highest turnout recorded (over
83% observed in both Malta and Ireland) corresponds to the system of 'single transferable vote', which
gives the voter a wide margin of freedom.

The nature of the election may be important too, depending on the context. In France for example, 
as the president has a lot of power, the participation rate of the presidential election is especially high 
when compared to the parliamentary election. 

• Abstention and Blank and null votes 

Political science analyses pay little attention to blank and null votes, mostly
because these ballots represent a very small number. Typically, these votes are
aggregated within a single category, blank and null votes, which in some countries is simply called null (or invalid)
votes. Numerous studies have shown from the 1950s on that null ballots are distributed at
random, according to the law of large numbers, and haphazardly for a given manner of voting [9].
The analysis of individual polling stations still confirms this. However, blank votes are more sensitive to
the circumstances of the election and take on, with regard to abstention, a more complex meaning.

Voters casting a blank vote have motivations close to those of voters abstaining for political reasons.
This "civic abstention", as Alain Lancelot called it, expresses a particular attitude regarding the voting
procedure [Hllin]. Statistical analyses of abstention and of blank and invalid votes show a negative correlation,
often quite strong, between these two ways of not taking part in the election. In France it has often been
observed that abstention is higher in the most populated municipalities and, conversely, less pronounced
in rural areas. On the other hand, blank and null ballots are less numerous in large agglomerations, while
their number shows an upward trend in smaller municipalities, where "civic abstention" plays an important part.
This correlation between abstention and blank and null ballots tends to become more complex, and the
political attitude of "withdrawal", or of standing politically "offside", is less easy to analyze. Urbanization has led
to important changes in lifestyle and therefore in voting behavior. We analyze the situation in
several countries, for all votes for which we have been able to obtain data, and hence try to better understand the
interrelation, if any, between abstention, blank and null ballots and the expression of the vote.

• Stylized facts 

In this paper we consider together the three values: abstention, blank/null votes and total valid votes.
The focus will be on the identification of statistical regularities, in the spirit of recent statistical physics
analysis of elections data - see e.g. pTH25].

^This system, called the Hare system of voting, is a variant of proportional representation where the voters rank the
candidates according to their preferences.

In the present work, by analyzing a large number of elections in 11 different countries without
compulsory voting, we point out that they share a common feature. Introducing a measure of the civic involvement
of the electorate, we show that this quantity exhibits a sharply peaked distribution around a common value
in highly populated municipalities in recent times. Moreover we suggest that this common stylized fact,
which we call a 'weak law on recent electoral behavior among urban voters', reveals an emerging collective
behavioral norm, typical of citizen voting behavior in modern democracies.

The paper is organized as follows. First we describe the dataset used in this study, at three different
scales (at the municipality scale, at a larger scale but for older times, and at the polling station level when
possible). Then, we introduce and discuss what we call the involvement entropy. We then analyze
electoral data according to this measure, and give indications of the existence of a possible norm revealed by a
common value of this measure. The Supporting Information (SI) gives more details where necessary.

Results 

Dataset 

In this paper we analyze electoral data at three different scales. (1) Data aggregated at the municipality
scale. In this way, we study phenomena with respect to the population size of municipalities. The 76
elections studied in this paper at the municipality level are mostly recent (after 1990) and are taken from 11
different countries (Austria, Canada, Czech Republic, France, Germany, Italy, Mexico, Poland, Romania,
Spain and Switzerland). (2) Electoral data aggregated at a large scale, e.g. national, provincial, etc. Here,
we focus the analysis on time evolution. The countries studied for their historical aspects are the same as those
studied at the municipality scale. The study begins at the earliest possible year, i.e. at the beginning
of so-called democratic regimes, after World War II, and even earlier in some cases (e.g. 1884 for the
≈ 530 Swiss referendums). (3) Electoral data aggregated at the polling station level. Polling stations
over the 100 most populated municipalities are analyzed whenever possible (i.e. for Canada,
France, Mexico, Poland and Romania). Some intra-town phenomena are investigated in this way.

Some elections are studied as a function of the number $N$ of registered voters per municipality. This is
the case when the following conditions hold: (1) the election takes place in a democratic country with no compulsory
voting and no penalty for people who do not vote; (2) the number of registered voters per municipality
is well established; (3) the available data provide for each municipality, at least, the number of registered
voters, the number of votes or the turnout rate, and the number of valid votes. We note that all countries
for which we have data at the municipality scale have more than 2000 municipalities, which allows
us to carry out statistical analyses. Moreover, all elections studied here are national ones, except for Land
Parliament elections in Germany. Lastly, the choice of the studied elections does not follow a plan but
simply reflects the availability of electoral data.

^In particular this excludes from our study both the U.S.A. and England.

Table 1. Countries where elections are analyzed in this paper (first column). Number of
elections studied at the municipality scale (second column), and the date from which they are studied
at national or provincial scale (third column) - even if it is before the end of compulsory voting in
Austria and in Italy. A star indicates that electoral data are also known at the polling station level. Number
of municipalities per country: ≈ 2400 in Austria (At); ≈ 7700 in Canada (Ca); ≈ 2700 in Switzerland
(CH); ≈ 6400 in Czech Republic (Cz); ≈ 36000 in Metropolitan France (Fr); ≈ 12000 in Germany (Ge);
≈ 8100 in Italy (It); ≈ 2400 in Mexico (Mx); ≈ 2500 in Poland (Pl); ≈ 3200 in Romania (Ro); ≈ 8100
in Spain (Sp). See the SI, Section A for more details.

Figure 1

Among these 76 elections, 31 are also analyzed at the polling station level in the 100 most
populated towns: 5 from Canada (≈ 25000 polling stations), 13 from France (≈ 7000 polling stations),
4 from Mexico (≈ 55000 ballot boxes), 11 from Poland (≈ 8000 polling stations), and 4 from Romania
(≈ 6000 polling stations). Tab. 1 summarizes the set of elections studied in this paper, and more details
on these data are given in the SI, Section A.

Abstentions, valid votes and blank or null votes 

Let us describe the classification of citizens retained here to characterize the electoral mobilization of
registered voters. For each given election and each specific scale (a municipality, a province, a country, etc.)
we distinguish: (1) the total number $N$ of registered Voters; (2) the number $N_a$ of Abstentionists, the
persons who do not take part in the election; (3) the number $N_v$ of voters, among which (4) $N_{bn}$ Blank
and Null Votes and (5) $N_c$ Votes in favor of candidates or electoral lists of choices, also sometimes
called Valid Votes (see Fig. 1). Obviously $N_v = N_c + N_{bn}$ and $N = N_a + N_c + N_{bn}$. Note that in
Italy, Spain, and Switzerland, electoral data distinguish between Null Votes, $N_n$, and Blank Votes, $N_b$.
Moreover, only in Spain, "Votos Válidos" means $N_v - N_n$, which differs from other countries where "Valid
Votes" means $N_v - N_n - N_b$. In this paper, we consider for all countries that Valid Votes are defined
as $N_c = N_v - N_{bn}$. See the SI, Section F for more discussion about countries where Blank Votes and
Null Votes are distinguished from each other.

As discussed in the following, we characterize the civic involvement of registered voters by the choice
between the three possible states: Abstention, Blank or Null Vote, and Valid Vote. The civic involvement
of electors is then measured here through the set of the three ratios $\{p_a, p_c, p_{bn}\}$, defined by

$$p_a = \frac{N_a}{N}, \qquad p_c = \frac{N_c}{N}, \qquad p_{bn} = \frac{N_{bn}}{N}, \qquad (1)$$

with $p_a + p_c + p_{bn} = 1$. Each election can then be represented by a point in the simplex $p_a + p_c + p_{bn} = 1$,
as illustrated in Fig. 2. Since the number of Blank and Null votes is typically small, most points clearly lie
near the edge $p_{bn} = 0$. However, here we do not want to neglect this component (see below for a deeper
discussion). A second basic observation is that there is a wide dispersion along the $p_a$-$p_c$ axis.
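As a minimal sketch of how the three ratios follow from raw counts, the Python function below takes the number of registered voters, abstentionists, and blank or null votes for one electorate and derives the valid votes from $N = N_a + N_c + N_{bn}$. The function name, argument names and the example counts are hypothetical and chosen only for illustration.

```python
def involvement_ratios(n_registered, n_abstentions, n_blank_null):
    """Return (p_a, p_c, p_bn) for one electorate.

    Valid votes are deduced from N = N_a + N_c + N_bn, so the three
    ratios sum to 1 by construction.
    """
    if n_registered <= 0:
        raise ValueError("need at least one registered voter")
    n_valid = n_registered - n_abstentions - n_blank_null
    if min(n_abstentions, n_blank_null, n_valid) < 0:
        raise ValueError("counts are inconsistent with N = N_a + N_c + N_bn")
    return (
        n_abstentions / n_registered,
        n_valid / n_registered,
        n_blank_null / n_registered,
    )


# Hypothetical municipality (illustrative numbers, not taken from the data set).
p_a, p_c, p_bn = involvement_ratios(10_000, 4_900, 30)
```

Each such triple is one point of the simplex discussed in the text.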

Previous work [22] has revealed strong regularities in the fluctuations around the mean of $p_v = N_v/N$,
more exactly of the logarithmic turnout rate $\tau = \log\frac{p_v}{1-p_v}$, when looking at its distribution over
municipalities, and particularly over the most populated municipalities. Similarly, a logarithmic three-choice
value can be defined, for which the same type of regularities can be observed when considering
polling stations within municipalities (see the SI, Section D). Moreover, this analysis of fluctuations
suggests that individual behavior is not well explained by a sequential binary choice (to vote or not,
then to cast a valid vote or not). This, together with the following analysis, justifies considering the three
quantities $p_a$, $p_c$ and $p_{bn}$ together. Hence the electoral involvement should be viewed through the three possibilities
available to the voters: abstention, blank/null votes and votes according to the list of choices.

In addition, the analysis of fluctuations (see the SI, Section D) says nothing about the mean value itself.
In this paper, we are precisely interested in the properties of mean values.

^Some countries, like Canada and Poland, aggregate Blank and Null votes into a single term called Null votes,
Invalid votes, or Spoilt votes.

Figure 2. Simplex $p_a + p_c + p_{bn} = 1$, in which any given election, for the most populated
municipalities, can be represented by a point, as illustrated by the symbols corresponding to
particular elections of our data set (see the SI, Section A for details). The continuous curves are
lines of constant involvement entropy value, drawn for values ranging from $S = 0.6$ to $S = 1.4$.
See text for At-2002-D, Ro-2009-E, At-2010-P and It-2004-E. For Ca-2008-D: $p_a \approx 0.49$,
$p_c \approx 0.51$, $p_{bn} \approx 0.003$ and $S \approx 1.02$; for Fr-2000-R: $p_a \approx 0.71$, $p_c \approx 0.25$,
$p_{bn} \approx 0.036$ and $S \approx 1.02$; for Fr-1995-P2: $p_a \approx 0.23$, $p_c \approx 0.73$, $p_{bn} \approx 0.044$
and $S \approx 1.01$; for Mx-2003-D: $p_a \approx 0.59$, $p_c \approx 0.40$, $p_{bn} \approx 0.013$ and $S \approx 1.04$;
and for Mx-2006-D: $p_a \approx 0.40$, $p_c \approx 0.58$, $p_{bn} \approx 0.012$ and $S \approx 1.04$.

The involvement entropy 

We introduce a variable whose value, as we will argue, is appropriate for characterizing the mean civic
involvement of the electorate. Viewing the three ratios $\{p_a, p_c, p_{bn}\}$ as probabilities, it is interesting to
associate with each election, instead of these three numbers, a single scalar characterizing the probability
distribution itself. One natural quantity associated with a probability distribution is the entropy, $S$, defined
by

$$S(p_a, p_c, p_{bn}) = -p_a \log(p_a) - p_c \log(p_c) - p_{bn} \log(p_{bn}). \qquad (2)$$

Here, and throughout this paper, log means the base-two logarithm ($\log(2) = 1$, and the entropy is said to
be in units of bits).
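The entropy defined above can be evaluated directly from the three ratios. The following Python sketch is one possible implementation, not the authors' code: the function name is hypothetical, terms with a zero ratio are taken to contribute zero (the usual convention), and the tolerance on the sum of the ratios is an assumption made so that values rounded to two or three decimals are still accepted.

```python
import math


def involvement_entropy(p_a, p_c, p_bn):
    """Shannon entropy, in bits, of the ratios (p_a, p_c, p_bn).

    Assumption: a ratio equal to 0 contributes 0 to the sum
    (the limit of -p * log2(p) as p goes to 0).
    """
    ratios = (p_a, p_c, p_bn)
    if any(p < 0 for p in ratios):
        raise ValueError("ratios must be non-negative")
    # Loose tolerance (an assumption) so that rounded published values pass.
    if not math.isclose(sum(ratios), 1.0, abs_tol=1e-2):
        raise ValueError("ratios must sum to 1")
    return sum(-p * math.log2(p) for p in ratios if p > 0)


# Rounded ratios quoted for Ca-2008-D in the caption of Figure 2.
s_canada = involvement_entropy(0.49, 0.51, 0.003)
```

With these rounded inputs `s_canada` comes out at about 1.02, the value quoted in the caption of Figure 2, and an even split between abstention and valid votes with no blank or null votes gives exactly 1 bit.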

Within the framework of Information Theory, where it is called the Shannon entropy, this quantity 
can be understood as a measure of missing information, or of average surprise, associated to the studied 
random process [24]. In the context of Statistical Physics, it is the Boltzmann-Gibbs entropy measuring 
the degree of 'disorder' of the system under consideration [25]. In the present context, we will refer to S 
as the entropy of civic involvement, or "involvement entropy", and consider it as a measure of disorder 
vs. order in the civic involvement at a collective level. Indeed, it is a 'macroscopic' or collective measure 
of the civic involvement of an electorate, and not a measure of the civic mobilization of individual
citizens, i.e., we do not claim that it corresponds to the behavior of a representative citizen. It can
be measured at any scale of aggregate data, e.g. for a municipality, a province, or a whole country.
For instance, the involvement entropy of a municipality, $S$, is given by Eq. (2), where the three ratios
$\{p_a, p_c, p_{bn}\}$ are the ratios of, respectively, the number of abstentionists, $N_a$, valid votes, $N_c$, and blank and
null votes, $N_{bn}$, over the total number of registered voters in the considered municipality.

Figure 3. Average values $\bar{S}$ of the involvement entropy of municipalities, $S$,
as a function of the number of registered voters $N$, for the first round of Mayor
elections in France. There are two kinds of voting rules, depending on whether the population size
is below or above 3500 inhabitants (see text). The inset shows average values of $p_a$, $p_c$ and $p_{bn}$ as a
function of $N$ for the 2008 municipal elections (which lead, for highly populated municipalities, to
a plateau of $S$ despite variations in $p_a$, $p_c$ and $p_{bn}$). For each $N$, the average values
$\bar{S}$, $\bar{p}_a$, $\bar{p}_c$, $\bar{p}_{bn}$ are evaluated over ≈ 200 municipalities of size ≈ $N$.

Let us explain more what we mean by 'order/disorder', and how this is reflected by the entropy value. 
We consider that a civic involvement shows an 'ordered' state if one of the three ratios is very close to 
one (hence the two others very small). A 'disordered' state corresponds to having all three ratios of 
similar values. Within this viewpoint, no particular role or importance is assigned to any one of the 
three possible cases, abstention, blank/null, valid vote. The involvement entropy $S$, a non-negative
quantity, provides a well-defined way to quantify the degree of disorder: the larger the entropy, the larger
the disorder. The maximum order is obtained when one of the ratios is equal to unity (and then the two
others are equal to zero), in which case $S = 0$. In contrast, the maximum disorder corresponds to an
equipartition of these 3 ratios, that is $p_c = p_a = p_{bn} = 1/3$, in which case the entropy takes its maximal
possible value, $S = \log(3) \approx 1.58$.
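For concreteness, the two extreme values can be worked out directly from the definition of the entropy, with base-two logarithms and under the usual convention (an assumption stated here, not spelled out in the text) that a term with a zero ratio contributes zero:

```latex
\begin{align*}
  S(1, 0, 0) &= -1 \cdot \log(1) = 0, \\
  S\!\left(\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3}\right)
    &= -3 \cdot \tfrac{1}{3} \log\!\left(\tfrac{1}{3}\right)
     = \log(3) \approx 1.585 .
\end{align*}
```

Any other triple of ratios gives an entropy strictly between these two bounds.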

As an illustration, consider the elections for the Mayor in the French municipalities. It is well known
(at least in France) that participation in elections in small municipalities is typically larger than in large
cities, for social reasons - for instance, in small municipalities where everyone knows everyone else,
not going to the polling station will become common knowledge. Such social enforcement of the civic
involvement might be at the root of an increase of the number of abstentionists with population size: the
ratio $p_a$ of abstentionists is typically very low for small municipalities, and increases with the municipality
size, $N$. One then expects an increase of the involvement entropy with municipality size: this is indeed
what we observe for the first round of the 2001 and 2008 elections (elections for which we have the
data for all the municipalities), as illustrated in Fig. 3. We can say that the electorate is very "ordered"
(in terms of its civic involvement) for low municipality size, and gets more "disordered" with increasing
$N$. This increase of the involvement entropy is observed up to a threshold population size, at which the
electoral rule changes: the citizen has a larger number of possible voting choices in municipalities with
fewer than 3500 inhabitants than in more populated municipalities. Remarkably, above
this critical size, the involvement entropy becomes essentially independent of the population size: one
has a plateau, at $S$ slightly above 1, despite variations in $p_a$, $p_c$ and $p_{bn}$. As we will see throughout
this paper, this particular value of involvement entropy, $S \approx 1$, shows up as a typical value in modern
elections for the most populated cities.

Let us give other illustrations. A highly ordered electorate results when: (1) the population of
registered voters is highly polarized, i.e. there is an important difference between $p_a$ and $p_c$ ($p_a \ll p_c$ or
$p_a \gg p_c$); and (2) blank and/or null votes are very few, that is $p_{bn}$ is very small. Such cases of small
entropies are, e.g., the 2002 Austrian Chamber of Deputies election, for which $p_a \approx 0.17$, $p_c \approx 0.81$,
$p_{bn} \approx 0.011$ and $S \approx 0.73$; and the 2009 European Parliament election in Romania, with $p_a \approx 0.81$, $p_c \approx 0.18$,
$p_{bn} \approx 0.008$ and $S \approx 0.73$. Conversely, a highly disordered electorate results when: (1) the population
of registered voters is not very polarized, that is $p_a$ and $p_c$ are not very different; and (2) blank and/or null
votes are relatively numerous, that is $p_{bn}$ is not too small. For instance, the 2010 Austrian Presidential
election has $p_a \approx 0.48$, $p_c \approx 0.49$, $p_{bn} \approx 0.034$ and $S \approx 1.16$; and the 2004 European Parliament election
in Italy has $p_a \approx 0.29$, $p_c \approx 0.66$, $p_{bn} \approx 0.053$ and $S \approx 1.11$. Note that these values are those of
large towns (see the SI, Tab. S1), whereas $S$ is more spread out in small municipalities (see Fig. 7).
Finally, one finds that the involvement entropy $S$ frequently has a value very near 1.0. Examples are
the 2008 Canadian Chamber of Deputies election, the 2000 French referendum, the 1995 French second
round Presidential election, and the 2003 and 2006 Mexican Chamber of Deputies elections (see Fig. 2 and the
SI, Tab. S1). In all these examples, despite an important diversity in $p_a$ values, $S$ lies between 1.01 and
1.04, showing that the electorate polarization is somewhat halfway between order and disorder. Note
that $S = 1$ is the entropy associated with the tossing of a fair coin. In the present context, it would be
exactly obtained for elections with $p_a = p_c = 50\%$ and $p_{bn} = 0$.

^Citizens living in municipalities with fewer than 3500 inhabitants are allowed to combine candidates from different
opposing lists, or to add new names of citizens who are not officially candidates.

Data Analysis 

We have computed the involvement entropy $S$ for all the elections of our data set, at different scales.
First we find that, most often, it depends on the municipality size $N$. To analyze this size dependency,
we group municipalities into samples according to their population size. In each
sample, municipalities have roughly the same number of registered voters. The number of municipalities
per sample is of order 100, except for France, in which case this number is 200 (because France has many
more municipalities than the other countries studied in this paper). We denote by $\bar{S}$ the average of the
involvement entropy $S$ over all municipalities inside a sample. This average $\bar{S}$ is plotted in Fig. 4 as a
function of the number of registered voters, $N$.
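The grouping step described above can be sketched as follows in Python. This is an illustration under stated assumptions rather than the procedure actually used for the figures: the function name is hypothetical, municipalities are given as `(N, S)` pairs, samples are formed by slicing the size-ordered list into consecutive blocks, and the last block is simply allowed to be smaller (the text does not say how a remainder is handled).

```python
from statistics import mean


def binned_mean_entropy(municipalities, bin_size=100):
    """Average (N, S) over consecutive samples of size-ordered municipalities.

    municipalities: iterable of (registered_voters, entropy) pairs.
    Returns one (mean N, mean S) pair per sample, from the smallest
    municipalities to the largest.
    """
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    ordered = sorted(municipalities, key=lambda m: m[0])
    samples = [ordered[i:i + bin_size] for i in range(0, len(ordered), bin_size)]
    return [
        (mean([n for n, _ in sample]), mean([s for _, s in sample]))
        for sample in samples
    ]


# Toy data (hypothetical values) with a tiny bin size for readability.
toy = [(400, 1.0), (100, 0.7), (300, 0.9), (200, 0.8)]
curve = binned_mean_entropy(toy, bin_size=2)
```

Plotting the second element of each pair against the first gives a curve of the kind described for $\bar{S}$ as a function of $N$; `bin_size=200` would mimic the choice made for France.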

Let us give the 1995 French second round Presidential election (Fr-1995-P2) as an example. A
relatively ordered civic involvement of the electorate is observed for the smallest municipalities, with
$\bar{S} \approx 0.7$. The mean involvement entropy then increases with municipality size, for sizes up to $N \approx 10000$.
For the most populated municipalities, that is above this threshold value in population size, a
saturation occurs: the (average) civic disorder of the electorate becomes independent of municipality size, with
$\bar{S} \approx 1$.
^In this paper, $\bar{X}$ means the average value of the considered quantity, $X$, over all municipalities (around 100, or 200 for
France) in a given sample where municipalities have roughly the same number of registered voters, $N$; e.g. $\bar{S}$, $\bar{p}_a$, $\bar{p}_{bn}$, etc.

Figure 4. Mean values $\bar{S}$ of the involvement entropy for municipalities, as a function of
the number of registered voters $N$. Each point results from an average over a sample of ≈ 100
(200 for France) municipalities of size ≈ $N$. The inset of the Italian graph shows a variant of $S$ where Blank Votes
are grouped with Valid Votes (see the SI, Section F for a deeper discussion). See the SI, Section A and Tab. S1,
for more details on the data.

Figure 5. Evolution in time of the involvement entropy, $S$, at large scale (national, provincial,
etc.) of 321 elections (see the SI, Section A and Tab. S2, for more details), apart from Swiss
referendums. For each country, electoral results are equally divided into two groups: those which
occurred in the first period in time and those which occurred in the second one. Histograms of $S$
(and of $p_a$, $p_c$ and $p_{bn}$ in the insets) show the involvement entropy of the
first and second groups over all countries. See the SI, Fig. S1, which plots all the elections
for each country, and also Fig. S4 for scatter plots ($p_a$, $p_{bn}$) of these elections, but at the national
aggregate scale.

Figure 6. Mean involvement entropy for European Parliament elections. Fig. 6-a (left panel):
same elections as those shown in Fig. 4; here for all countries, including France, averages are over ≈ 100
municipalities. Fig. 6-b (right panel): same elections as those shown in the SI, Fig. S1; the vertical
dashed line indicates the year of the abolition of compulsory voting in Italy. Here, Italian Blank
Votes, $N_b$, (but not Null Votes, $N_n$) are grouped with votes in favor of lists of candidates (see the SI,
Section F for more discussion).

Next we consider the time evolution of the involvement entropy at a large scale (country, province,
canton, etc.). When the scale of aggregate data is lower than the national one, each value of the
involvement entropy for one election is equal to a weighted (by population size) mean value of the involvement
entropies at the lower scale (province, canton, etc.). (See the SI, Section A and Tab. S2 for more details.)
In the SI, Fig. S1 plots the involvement entropy of each election at large scale, for each country and over
all elections (according to their nature), as a function of time, and Fig. S2 shows how $p_a$ and $p_{bn}$ evolve in
time for Chamber of Deputies elections in each country. A rapid evolution in time of $S$ can
be seen in Fig. 5, where elections in each country are divided into two groups (of roughly the same
size): the older ones and the most recent ones. This figure shows histograms of the involvement
entropy at large scale (and also of $p_a$, $p_c$ and $p_{bn}$) for 321 elections, according to their relative position
in time for each country. From the relatively older elections to the more recent ones, a significant peak
appears at $S \approx 1$, which we described as halfway between order and disorder. This peak mainly emerges in
parallel with the significant decrease over time of highly ordered elections (from the civic involvement point of
view). In other words, nowadays there are few elections with a small civic involvement entropy $S$ (say
$S < 0.8$), but there are many elections with $S \approx 1$.

Finally, Fig. 6 shows, for all the European Parliament elections, how the involvement entropy of
municipalities depends on population size (as in Fig. 4), and the time evolution at the national or
provincial scale (as in the SI, Fig. S1).
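The population-weighted aggregation mentioned in this section, where a large-scale entropy is obtained from the entropies of lower-scale units such as provinces or cantons, can be sketched in Python as below. The function name is hypothetical, and taking the number of registered voters $N$ of each unit as its weight is an assumption, since the text only says "weighted (by population size)".

```python
def weighted_mean_entropy(units):
    """Population-weighted mean of lower-scale involvement entropies.

    units: iterable of (weight, entropy) pairs, one per province, canton, etc.
    Assumption: the weight is the number of registered voters of the unit.
    """
    units = list(units)
    total_weight = sum(weight for weight, _ in units)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    return sum(weight * entropy for weight, entropy in units) / total_weight


# Two hypothetical provinces: the larger one dominates the aggregate value.
s_large_scale = weighted_mean_entropy([(1_000, 0.8), (3_000, 1.0)])
```

Here `s_large_scale` is 0.95, closer to the entropy of the larger unit than a plain average (0.9) would be.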

The common occurrence of S ~ 1 
What the common occurrence is 

As already noted, Fig. 4 shows the remarkable fact that, for each studied country, in modern elections
the involvement entropy of highly populated municipalities is very frequently roughly equal to 1. This
common value, $S \approx 1$, for municipalities of large population size is particularly striking when one looks at
European Parliament elections (see Fig. 6-a). See also Table 2 for a rapid overview and basic statistics
per country about involvement entropies and population size of the ≈ 100 most populated municipalities.
There are however noticeable exceptions, notably the Italian case, to which we will come back later
(Section Discussion). In any case, we now have to specify better what we mean by $S \approx 1$ and show more
quantitatively in which way it is a common property of modern elections. This is done by gathering
data over all elections after 2000 (after 2000 in order to take into account the evolution in time of the
involvement entropy, as stressed by Fig. 5 and Tab. 2). Fig. 7-d plots the resulting histograms of the
involvement entropy restricted to the ≈ 100 most populated municipalities, for different countries or ensembles
of countries. Moreover, Fig. 8 shows the minimal-length intervals of $S$, $p_a$, $p_c$ and $p_{bn}$ which
contain 50% of events (those plotted in Fig. 7-d). These two figures show a common sharp peak at a
value of $S$ close to 1. The involvement entropy appears to be mainly in the range $0.98 < S < 1.08$, which
can be taken as the definition of $S \approx 1$ in this paper. Note that this definition applies to the most
populated municipalities. At large scale, the involvement entropy depends on the way the data are
aggregated (at national, province, etc. scale), and it is slightly greater than $S$ for the most populated
municipalities. Nevertheless the involvement entropy measured at large scale approximately reflects the
behavior of the most populated municipalities, because an important share of the population lives in the ≈ 100 most
populated municipalities (as seen in Tab. 2).
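One way to compute a minimal-length interval containing 50% of events is to slide a window over the sorted sample and keep the narrowest window that holds the required number of points. The Python sketch below follows that idea; the function name is hypothetical, and restricting the interval endpoints to observed sample values is an assumption, as the text does not detail the estimator used.

```python
import math


def shortest_interval(values, fraction=0.5):
    """Shortest interval [lo, hi] containing at least `fraction` of the values.

    Assumption: both endpoints are observed sample points.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    xs = sorted(values)
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one value")
    k = max(1, math.ceil(fraction * n))  # points the window must contain
    # Among all windows of k consecutive sorted points, pick the narrowest.
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]


# Hypothetical entropies of six municipalities.
interval = shortest_interval([0.70, 0.99, 1.00, 1.01, 1.05, 1.30])
# interval == (0.99, 1.01)
```

Applied separately to $S$, $p_a$, $p_c$ and $p_{bn}$, a narrow interval signals a sharply peaked distribution, which is the comparison the text draws between the entropy and the three ratios.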

It is important to stress that the common occurrence 5" « 1 appears (1) as a property of high 
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Figure 7. Histograms of involvement entropy, S, with respect to the relative 
municipality-size bin over all analyzed since 2000. There are 12 French elections, 7 Austrian 
elections, 11 Polish elections, 7 German elections and 24 for others countries (included in one curve, 
with no more than 4 elections per country). Municipalities of each country are divided into bins (of 
« 100 municipalities) with respect to their municipality-size (see e.g. Fig. 2]). For instance, 'Rank 25%' 
(Fig. El-a) means the bin whose population-size rank is the twenty-fifth per cent with regard to the 
sample of the most populated municipalities (Fig. [7]-d) of the considered country. Insets: histograms of 
corresponding pa, Pc and phn- S = 0.98 and S ~ 1.08 are plotted in dashed lines and all the scales axis 
are similar from one plot to another one. 
Table 2. Basic information about the bin of the ≈ 100 most populated municipalities per 
country (Ctry). n_el denotes the number of elections analyzed; N_min, the lowest number of registered 
voters among the municipalities of this bin; ⟨N⟩, the average value of N over these municipalities; 
and N_bin/N_ctry, the ratio of registered voters in this bin to those in the whole country. 
S is classified according to the values 0.98 and 1.08. 
Figure 8. Minimal intervals containing 50% of events, for the involvement entropy S and the ratios 
pa, pc and pbn of the 100 most populated municipalities, over all elections since 2000. See Fig. 7-d for 
the related histograms. 



populated municipalities and (2) only in recent times. See Fig. 5 or the SI, Fig. S1, for an indication of 
the latter point. For the first point, Fig. 7 shows the histograms of the entropy for different municipality 
sizes. Compared with the histograms for the most populated municipalities (Fig. 7-d), the histograms for smaller 
municipality sizes appear (1) much less peaked (apart from the Polish elections) and (2) not peaked at the 
same common value. Moreover, it is only for the largest sizes that all the histograms become very similar, 
suggesting convergence to a universal histogram at large sizes. Let us bear in mind (cf. Tab. 2) that 
the sample of the ≈ 100 most populated municipalities in Austria is, on average, much less populated than 
those of the four other countries or groups of countries. In other words, the Austrian sample of the 
≈ 100 most populated municipalities is not fully comparable to the four other ones, and does not accurately 
reflect typical behavior in large municipalities (especially since civic involvement can 
depend significantly on population size, as shown in Fig. 2). Lastly, the choice of the number 
(here 100) of most populated municipalities is only a matter of statistical convenience and does not affect the 
results (see e.g. the SI, Fig. S3, which is similar to Fig. 7-d but for samples of the 50 or 200 most 
populated municipalities). 

Now, let us better quantify this sharp and common peak for the most populated municipalities. 
First, Fig. 9-a plots the smallest distance (S_sup - S_inf) such that 50% of events are included in the 
interval [S_inf, S_sup], as a function of the relative municipality size. This confirms that (apart from the Polish 
elections) the distributions of S become more peaked as the population size increases, and specifically for the 
most populated municipalities. Moreover (apart from the Austrian elections), the minimal distance 
(S_sup - S_inf) appears to converge to a common value, and this only for the most populated sample (see 
also Fig. 8 for S_inf and S_sup for this latter sample). Next, in order to quantify the common-peak 
phenomenon, we calculate the overlap between the distributions of S for municipalities as a function of the 
relative population size (see Fig. 9-b). The overlap between n distributions of S, with probability density 
functions (pdf) $f_i(S)$, $i = 1, 2, \ldots, n$, is defined as $O_n = \int \min[f_1(S), f_2(S), \ldots, f_n(S)]\, dS$. Fig. 9-b 
shows an increasing overlap between distributions when the population size increases, and specifically for 
the most populated municipalities. This confirms that the distributions of S become more and more similar 
as the relative municipality size increases, with (sharp) peaks becoming identical for the most populated 
municipalities. 
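
The two quantities used above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the pdfs are estimated with fixed-width histograms (the binning, the range [0, 1.6) and the function names are hypothetical choices, not the authors' procedure), and the minimal interval is found by sliding a window over the sorted values.

```python
import math

def minimal_interval(values, fraction=0.5):
    """Shortest interval [s_inf, s_sup] holding at least `fraction` of the values."""
    xs = sorted(values)
    k = max(1, math.ceil(fraction * len(xs)))  # number of points to enclose
    # Slide a window of k consecutive sorted points and keep the narrowest one.
    start = min(range(len(xs) - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]

def histogram_pdf(values, lo, hi, bins):
    """Histogram estimate of a pdf on [lo, hi), normalized by the sample size."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        if lo <= v < hi:
            counts[min(int((v - lo) / width), bins - 1)] += 1
    return [c / (len(values) * width) for c in counts]

def overlap(samples, lo=0.0, hi=1.6, bins=80):
    """O_n: integral over S of the pointwise minimum of the n estimated pdfs.

    The default upper bound 1.6 covers the maximum of a three-class base-2
    entropy, log2(3), which is about 1.585.
    """
    pdfs = [histogram_pdf(s, lo, hi, bins) for s in samples]
    width = (hi - lo) / bins
    return width * sum(min(column) for column in zip(*pdfs))
```

With this estimator, identical samples give an overlap of 1 and samples with disjoint supports give 0; the value for real data depends on the chosen bin width.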

^Taking ≈ 50 Austrian municipalities per sample yields, for the most populated sample, a histogram 
of S much more centered on S ≈ 1 than the one obtained with ≈ 100 municipalities (see the SI, Fig. S3). 

^The same features also appear when considering minimal distances that contain 25% or 10% of events. This 
supports the robustness of this simple method. 
Figure 9. Quantitative evidence of the sharp and common peak of S for the most 
populated municipalities. The elections considered, the way bins are ranked, the countries or groups of 
countries, and the legends are the same as in Fig. 7. Left (9-a): minimal interval (S_sup - S_inf) that 
encapsulates 50% of events, as a function of the relative population size. Right (9-b): overlap O_4 
between the 4 distributions of S of municipalities as a function of their relative municipality size (see text 
for the definition of O_n). The inset shows in the same manner the overlap O_3 between 3 distributions of S 
((1): all without At, Fr, Ge and Pl; (2): Fr; (3): Pl). Some curves obtained by reshuffling the pbn of 
municipalities (inside one country or group of countries), while pa is not modified, are also plotted. 



What the common occurrence is not 

We claim that this common most frequent value, S ~ 1 for the most populated municipalities, is not a 
mere statistical artefact. More precisely, we claim that: 

(1) it is not a direct consequence of the law of large numbers, which, for data aggregated at the scale 
of large municipalities, would give a systematic result; 

(2) it is not a result of 'pure chance', that is a bias in the data due to random events, or an accidental 
bias in the collected data; 

(3) it does not result merely from pa and pc being close to 50%, or close to a common value: 
there is a wide range of pa values for which S ≈ 1 is observed; 

(4) it does not result from having a small proportion of Blank and Null Votes. 

Let us now justify these claims. 
• About the first two points 

In support of the first two points, we note that there are robust properties which cannot be explained by 
the pure-chance or the large-number hypotheses. In particular: 

(i) S ≈ 1 is specific to modern elections. Indeed (apart from the Swiss votations discussed in Section 
Discussion), this common value S ≈ 1 appears recently, and at different times for different countries 
and different elections: in the 70's or 80's in France, the 80's in Germany, the 90's in Canada, the 2000's in the Czech 
Republic, etc. (cf. the SI, Fig. S1). Moreover, there is no systematic way in which the recent convergence 
to S ≈ 1 appears in time. S ≈ 1 may be reached from lower values (e.g. Chamber of 
Deputies elections in Canada, the Czech Republic, etc., in Fig. 6-b) as well as from higher values (e.g. European 
Parliament elections in France, in Fig. 6-b). Lastly, in a given country, some kinds of elections yield at large scale 
Figure 10. Relative importance of pbn on the distribution of S. Left (10-a): distribution of S 
obtained from flat distributions of pa (where pa ∈ [0.3, 0.7]) and pbn. The pdf of S can be peaked for a relatively 
broad distribution of pa and a small range of pbn, but the peak of the distribution of S, which depends 
on the pbn values, is not necessarily centered on S ≈ 1. The histogram of S over the ≈ 100 most 
populated French municipalities (the same as in Fig. 7-d) is also given as a guide. Right (10-b): pdf of 
S, for different ranges of pbn, over ≈ 100 French municipalities, selected according to their relative 
population size (as in Fig. 7-b, c, d). Histograms of S(pa, pbn) obtained from reshuffled pa, while pbn remains 
unchanged, are also given. 

S ≈ 1 since their introduction (e.g. European Parliament elections), and for some others, S ≈ 1 seems 
to act as an attractor in time (see e.g. Chamber of Deputies elections in Canada, the Czech 
Republic, France, Switzerland, etc., in the SI, Fig. S1). 

(ii) S ≈ 1 is only observed for large populations (and there is no common value for smaller municipality 
sizes), as shown in Fig. 9, and there is sometimes a plateau starting at a lower value of N, which depends both 
on the election and on the country (e.g. ≈ 3000 in Canada and the Czech Republic, 10000 in France for 
referendums, etc., in Fig. 4). Moreover, there is no systematic way in which convergence to S ≈ 1 occurs 
as the population size increases. S ≈ 1 may be reached from lower values (e.g. Fr-1995-P2, Sp-2004-E 
and Sp-2009-E) as well as from higher values (e.g. Fr-2000-R, Ge-2004-E and Ge-2009-E in Fig. 4). 
Lastly, S ≈ 1 may be reached through a discontinuous transition when the voting rule (which depends on the 
population size of municipalities) changes. This occurs for the two French local elections for the Mayor 
(see Fig. 2), which are the only elections in our database with such a change of electoral rule. 

(iii) The shape of the distributions of S for large municipality sizes does not result from a statistical 
bias due to large numbers: creating artificial highly populated municipalities, by aggregating 
the choices of large numbers of citizens who live in small and different municipalities, clearly does not yield a 
distribution peaked near S ≈ 1 (see the SI, Section C and Fig. S6, for more details). 

Finite-size effects, that is the effects of aggregating data at different scales, are considered more thoroughly 
in the SI, Section C, which compares the ballot-box scale with the municipality scale. That section also discusses 
further the statistical effects that could be due to large numbers. 

• About the last two points, concerning the ranges of pa and pbn values 

We show that, even if the distributions of S can be peaked for a relatively broad distribution of pa and 
small values of pbn, this cannot alone explain why the distributions of S for the most populated towns 
are so narrow and, in addition, have their peak at a common value of S. 
Figure 11. Relative importance of 
reshuffling pbn on S. The elections analyzed and 
the way they are divided into two 
groups are the same as in Fig. 5. Here, however, 
each election is aggregated at the national scale, 
i.e. S is directly evaluated from the set 
{pa, pc, pbn} at the national scale. In these figures, 
the surrogate S(pa, pbn) data are obtained by reshuffling 
pbn from one election to another within the same 
group, while pa is not modified. Surrogate curves 
result from the average of 1000 realizations, and 
standard deviations are plotted as error bars. 



Ranges of variation of pa and pbn. 

On the one hand, while pbn does not change radically in time at large scales, pa has increased during the last 
decades in most countries (see e.g. the insets of Fig. 5 and the SI, Fig. S2). On the other hand, pbn is known 
to decrease when the population size N of municipalities increases, as discussed in Section 
Introduction. Let us thus first consider the possibility that the common occurrence S ≈ 1 could be a 
consequence of these two facts: pa is not too small (nor pc too small) and, independently, pbn is small. 

We give three arguments against this assertion. (i) First, we plot in Fig. 10-a histograms of S resulting 
from a flat and broad distribution of pa and a flat distribution of pbn (with small values). Each histogram 
corresponds to a different choice of the range of (small) pbn values. The result is indeed a set of peaked 
histograms. However, these distributions of S are neither necessarily centered on S ≈ 1 nor centered on 
a common peak. 
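
The argument in (i) can be reproduced qualitatively with a small simulation. The sketch below is an assumption-laden illustration, not the authors' code: pa is drawn uniformly from [0.3, 0.7] (the range quoted in the caption of Fig. 10), the two pbn ranges are hypothetical, and the entropy uses a base-2 logarithm.

```python
import math
import random
import statistics

def involvement_entropy(p_a, p_bn):
    """Base-2 entropy of (p_a, p_c, p_bn) with p_c = 1 - p_a - p_bn (assumed convention)."""
    ratios = (p_a, 1.0 - p_a - p_bn, p_bn)
    return -sum(p * math.log2(p) for p in ratios if p > 0.0)

def simulate_flat(pbn_range, pa_range=(0.3, 0.7), n=20000, seed=0):
    """Draw p_a and p_bn independently from flat distributions and return the S values."""
    rng = random.Random(seed)
    return [
        involvement_entropy(rng.uniform(*pa_range), rng.uniform(*pbn_range))
        for _ in range(n)
    ]

# Two hypothetical ranges of small p_bn values.
for pbn_range in [(0.0, 0.01), (0.03, 0.04)]:
    s_values = simulate_flat(pbn_range)
    print(pbn_range, "median S =", round(statistics.median(s_values), 3))
```

The typical value of S shifts upward as the pbn range moves up, which is the sense in which independent flat inputs give peaked histograms without a common peak.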

(ii) Second, we emphasize the specificity of the most populated municipalities. Fig. 10-b plots, for French 
data (where the tested phenomenon is clearer), distributions of S restricted to elections for which pbn belongs 
to specific ranges of values. These distributions are also plotted according to the population size 
of municipalities. It is only for the most populated municipalities that the distributions of S for different 
ranges of pbn are roughly peaked at the same value S ≈ 1 (with very good agreement for pbn ∈ [0, 0.01[ 
and pbn ∈ [0.01, 0.02[). Moreover, for a lower population size, e.g. with a relative rank of 90%, it is 
interesting to note that the distributions of S for different ranges of pbn (apart from pbn > 0.03) share the 
same features as in Fig. 10-a, i.e. the distributions are peaked at different values. (For a more detailed 
view, see the SI, Fig. S5, which shows scatter plots (pa, pbn) for the municipalities taken into account in 
Fig. 10-b.) 

(iii) Third, there are actually wide disparities in the ranges of pa and pbn between different countries 
or groups of countries. One can see in Fig. 8 how (1) France and (2) all countries without At, Fr, 
Ge and Pl can reach the common S peak, despite largely different ranges of pa, pc and pbn. In other 
words, the ranges of the ratios pa, pc and pbn are not sufficiently similar between countries or groups of 
countries to explain why the distributions of S for the most populated municipalities share a sharp peak 
at a common value of S. 

Implied correlations between pa and pbn. 
Hence, it seems difficult to explain the common value S ≈ 1 for the most populated towns as a consequence 
of having, independently, a small pbn and pa in a given particular range. The observation of a common peak 

*For example, if pa < 0.227, then it is no longer possible to get S = 1. 

^To better understand this point, let $S_2(p_a) = -p_a \log(p_a) - (1 - p_a) \log(1 - p_a)$, which has a maximal value, $S_2 = 1$, for 
pa = 0.5. Moreover, when pbn = 0, $S_2$ is equal to the involvement entropy S (defined in Eq. (2)), i.e. $S_2(p_a) = S(p_a, p_{bn} = 0)$. 
Hence, relatively small variations of pa around 0.5 and very small values of pbn lead to S ≈ 1. 
around S ≈ 1 thus implies the existence of specific correlations between pa and pbn. 

To test this conclusion, we consider surrogate data obtained by reshuffling the ratios pbn from one 
municipality (or country) to another, while pa is kept unchanged (and pc is then deduced from 
pc = 1 - pa - pbn). Note that the marginal distributions of pa and pbn remain unchanged by this 
reshuffling procedure, whereas their correlations are destroyed. We use this method twice: first, (i) 
contrasting recent and old elections, and second, (ii) considering the dependence on municipality size. 
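
A minimal Python sketch of this surrogate construction is given below. It assumes that each unit (municipality or election) is described by a pair (pa, pbn), that pa + pbn stays below 1 after reshuffling, and that the entropy uses a base-2 logarithm; the function names are hypothetical.

```python
import math
import random

def involvement_entropy(p_a, p_bn):
    """Base-2 entropy of (p_a, p_c, p_bn) with p_c = 1 - p_a - p_bn (assumed convention)."""
    ratios = (p_a, 1.0 - p_a - p_bn, p_bn)
    return -sum(p * math.log2(p) for p in ratios if p > 0.0)

def reshuffle_pbn(p_a, p_bn, rng):
    """Surrogate data: permute p_bn across units while keeping p_a in place."""
    shuffled = list(p_bn)
    rng.shuffle(shuffled)
    return list(zip(p_a, shuffled))

def surrogate_entropies(p_a, p_bn, realizations=1000, seed=0):
    """S values of each surrogate realization (p_c follows from 1 - p_a - p_bn)."""
    rng = random.Random(seed)
    return [
        [involvement_entropy(a, b) for a, b in reshuffle_pbn(p_a, p_bn, rng)]
        for _ in range(realizations)
    ]
```

Because only the pairing changes, the marginal distributions of pa and pbn are preserved exactly, and any difference between real and surrogate distributions of S is attributable to the correlations between the two ratios.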

(i) Figure 11 shows, at the national scale and for two periods of time, how the distributions of S change 
under this reshuffling. The pbn are reshuffled within the same group of elections. For the first period, 
the real distribution of S, which is not peaked near S ≈ 1, and the surrogate one do not differ much. 
By contrast, the distributions are notably different for the second period. Moreover, 
the main difference concerns the peak near S ≈ 1: the peak of the surrogate distribution is less 
sharp than that of the real data. This is particularly interesting since pbn is distributed in roughly the 
same manner in the two periods (see the insets of Fig. 5 or the scatter plots (pa, pbn) in 
the SI, Fig. S4). The widening of the surrogate distribution of the involvement entropy near the peak S ≈ 1 
can be seen as a sign that there are correlations between pa and pbn which enforce the occurrence of 
S ≈ 1. 

(ii) From a qualitative point of view, the reshuffled data have a peak of S values which is less narrow 
than that of the real data, a discrepancy which increases with municipality size, as can be seen for the French 
data in the inset of Fig. 9-a and in the scatter plots (pa, pbn) in Fig. S5 of the SI. In addition, the 
distributions of S obtained for the reshuffled data are not as well peaked at a common value as those of 
the real data. Quantitatively, for the French data, the Kolmogorov-Smirnov distance between 
the distributions of real and reshuffled data is significantly larger for the most populated municipalities, 
with a distance that allows one to reject the hypothesis that the two distributions are similar (indeed 
the Kolmogorov-Smirnov distance is then 3.0 ± 0.2, while 1.6 corresponds to a ≈ 1% probability that the 
two distributions coincide). Moreover, Fig. 9-b shows that the overlaps between different distributions of S 
resulting from reshuffled pbn are smaller than for real data, and this only when municipality sizes are large, 
or even only for the most populated municipality sample: the reshuffling suppresses the strong increase of 
overlaps which is observed on real data for the sample of the most populated municipalities. 
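
The Kolmogorov-Smirnov comparison can be sketched as follows. This is an illustration under an explicit assumption: the quoted distance is read here as the two-sample statistic D scaled by sqrt(nm/(n+m)), because under the asymptotic Kolmogorov distribution a scaled value of about 1.6 has a tail probability of about 1%, which matches the figure quoted above. The exact convention used by the authors is not stated in this section, and the function names are hypothetical.

```python
import math

def ks_statistic(x, y):
    """Two-sample KS statistic D: largest gap between the two empirical cdfs."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        v = min(xs[i], ys[j])
        # Advance both cdfs past every observation equal to v.
        while i < n and xs[i] <= v:
            i += 1
        while j < m and ys[j] <= v:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d

def scaled_ks(x, y):
    """D multiplied by sqrt(n*m/(n+m)) (assumed normalization of the quoted distance)."""
    n, m = len(x), len(y)
    return ks_statistic(x, y) * math.sqrt(n * m / (n + m))

def kolmogorov_tail(lam, terms=100):
    """Asymptotic probability of a scaled distance larger than lam (for lam not too small)."""
    return 2.0 * sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, terms + 1)
    )
```

Applied to the S values of real and reshuffled data, `scaled_ks` would give the quantity compared against the 1% threshold, under the normalization assumed here.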

We can thus conclude that there is a specific property of the most populated municipalities, which 
is not captured by treating pa and pbn as independent variables. 

Discussion 

We suggest that the common value S ≈ 1 of the entropy, which appears recently in highly populated 
municipalities, reveals an emerging collective behavioral norm characteristic of citizen involvement in 
modern democracies, and we propose to call it a 'weak law' on recent electoral behavior among urban 
voters. Signs of the existence of this possible norm can be seen not only in the highest density 
of the involvement entropy S around S ≈ 1, whatever the country, type of election, etc., but also in 
its deviances. There are two kinds of deviance: in the first one, S is small (which generally occurs when 
pa or pc is very small); in the second one, S is high (which generally results from a large ratio of blank 
or null votes, pbn). We will see that these deviances are associated with particular phenomena of civic 
involvement, or are simply reduced to the norm (i.e. S ≈ 1) when the meaning of blank votes changes. 

When significantly smaller values are observed (e.g. S < 0.85) for cities, something appears inside 
towns (on average): the heterogeneity of the involvement entropy over all polling stations of a given town 
decreases when S of the whole city decreases. In other words, considering the civic involvement of the electorate 
in a given town, the smaller S is for the whole town, the more homogeneous the town appears (i.e. the involvement 
entropies at the polling-station scale are more homogeneous across the polling stations of the 
town). Section E of the SI shows this point (free of statistical bias), which is particularly clear when the 
ratio pc is high (compared to cases where pa is high). This civic involvement phenomenon for towns 
Figure 12. A modified involvement 
entropy, S', where Blank Votes are 
grouped with Valid Votes, as a function 
of the involvement entropy S, for ≈ 500 
Swiss referendums. (See the SI, Section F, 
for a deeper discussion.) Each point 
corresponds to the average of about 10 
referendums. Note the plateau S' ≈ 1 for 
S > 1.05. 
with small S can be seen as a signature of something 'new' which appears when a deviance from the norm 
occurs. 

On the other hand, elections where S is significantly greater than 1 typically correspond to cases where there 
has been an appeal (from political parties, citizens' blogs, etc.) to vote blank or null, which adds civic-involvement 
'tensions' to the election. It is remarkable that countries which distinguish 
blank votes from null votes yield, when blank votes are counted like valid votes in favor of one of the listed 
choices, a modified involvement entropy S' ≈ 1 whenever the involvement entropy is S > 1. (When 
blank votes are grouped with votes according to the list of choices, the modified involvement entropy S' 
is equal to S(pa, pc + pb, pn) in Eq. (2), and not S(pa, pc, pb + pn) as for the usual involvement entropy, 
where pb, pn and pbn = pb + pn denote respectively the ratios of blank votes, null votes, and blank or null votes.) 
See the striking plateau in Fig. 12 for Swiss referendums, which shows a modified involvement entropy 
S' ≈ 1 when S > 1. Moreover, Section F of the SI clearly shows this point, e.g. for European Parliament 
elections in Italy and for referendums in Spain. Hence the fact that S > 1 boils down to a modified 
involvement entropy S' ≈ 1 when blank votes are categorized as Valid Votes can be seen as the recovery of 
the 'weak law' through the decrease of civic-involvement 'tensions'. The fact that a deviance from the norm is 
naturally reduced to the norm (the involvement entropy is around 1) as soon as blank votes are grouped 
with 'valid votes' can be seen not as a haphazard occurrence but rather as a signature of the norm in 
a larger sense. 
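
The difference between the usual and the modified entropies can be made concrete with a short Python sketch. It assumes the base-2 convention used for S elsewhere in this paper; the function names and the example ratios are hypothetical and are not taken from any election.

```python
import math

def entropy(ratios):
    """Base-2 entropy of a set of ratios that sum to 1 (zero terms are skipped)."""
    return -sum(p * math.log2(p) for p in ratios if p > 0.0)

def usual_entropy(p_a, p_c, p_b, p_n):
    """S(p_a, p_c, p_b + p_n): blank and null votes form one category."""
    return entropy((p_a, p_c, p_b + p_n))

def modified_entropy(p_a, p_c, p_b, p_n):
    """S'(p_a, p_c + p_b, p_n): blank votes are grouped with votes for a choice."""
    return entropy((p_a, p_c + p_b, p_n))

# Hypothetical election with many blank votes, for illustration only.
ratios = dict(p_a=0.40, p_c=0.50, p_b=0.08, p_n=0.02)
print(round(usual_entropy(**ratios), 3), round(modified_entropy(**ratios), 3))
```

In this made-up example, moving the blank votes into the choice category lowers the entropy, which is the direction of the effect described for elections with S > 1.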

Now let us discuss the term 'weak law' further. On the one hand, the common value S ≈ 1 (for the 
most populated municipalities in recent times) appears as a kind of law for a phenomenon not measured 
up to now. This phenomenon concerns the involvement of the electorate, from a civic point of view. 
It is a kind of law because it occurs very frequently, without being based on a 'pure chance' phenomenon, 
despite wide disparities across elections, with strong regularities, and, as we have seen, it implies the 
existence of particular correlations between pa and pbn. On the other hand, it is clearly not a 'hard law', 
since strong deviations still exist. One cannot exclude that a broader law exists, encapsulating 
more regularities for the most populated municipalities (e.g. by taking into account the political context, 
the number of valid votes for the different choices, etc.), which might explain why S ≈ 1 appears in recent 
times and does not usually concern small municipalities. It may also be the case that a function other than 
the civic involvement entropy encapsulates more regularities about the set of ratios {pa, pc, pbn}. In any 
case, we believe that this weak law of recent urban civic involvement shows up as a consequence of some 
robust electoral behavior. As one more illustration, Swiss referendums show (at the canton scale) small 
fluctuations of S near this same value, S ≈ 1, from the 1880s to the present (see Fig. 13). 

To conclude, the main finding of this work, based on the analysis of a large number of elections from 
11 different countries, is that a common stylized fact emerges: in recent elections, the distribution of the 
involvement entropy is found to be sharply peaked near S ≈ 1 in highly populated municipalities (and thus 
also at national levels). This universal property is remarkable given the wide disparities across countries 
Figure 13. Time evolution of the 
involvement entropy, S, of 531 Swiss 
referendums, at large scale. Each point 
corresponds to the average (weighted by the 
number of registered voters) over all Swiss 
cantons (25 or 26 in number). In red (as a 
guide): average values over ≈ 10 
referendums. The inset shows the same quantities 
for the ratio of abstentionists, pa, and the 
ratio of blank and null votes, pbn. 



(and even within countries for different elections) in political mores, voting systems, in the way that lists 
of registered voters are established (on a voluntary basis or automatically, etc.), and so on. 

Moreover, S ≈ 1 appears to be very stable in time whenever it occurs for one kind of election, as for 
example for European Parliament elections in Western Europe, and most remarkably for the Swiss 
referendums since 1884. We propose to designate this strong regularity, neither a 'hard law' nor a mere 
statistical artefact, as a 'weak law' of electoral involvement characteristic of modern democracies in 
cities. We suggest that the existence of this weak law is the signature of an emerging collective behavioral 
norm. More studies and analyses would be necessary to better understand the conditions under which 
it occurs and its meaning (at the individual scale and/or at the macro scale). Moreover, it would be very 
interesting for forthcoming studies to determine whether this 'weak law' also occurs in emerging countries, 
in new democracies, in large cities (whatever they are), etc. 

The present study calls for a different point of view from those commonly used in Political Science. 
We do not work within the classical paradigm that explains electoral behavior with sociological, ethnic, 
or even institutional or rational-choice variables. Our purpose is to change the perspective of observation, using 
very large sets of data and looking for regularities (stylized facts), without restricting the analysis to a 
particular category based on chronology, space, or institutional or national specificities. At 
a 'macro' level, using aggregated data, and not at the individual scale, this new viewpoint focuses on (1) 
the involvement or the mobilization of the electorate, and (2) a measure of heterogeneity or, otherwise 
stated, of order and disorder. The question put to electoral data here is not why a more or less rational 
citizen participates or not in an election, but what the degree of disorder of the civic involvement of the 
electorate is. 



Materials and Methods 

The SI Section A gives more information about the set of (public) electoral data studied in this paper. 
Most of them can be directly downloaded from official websites (see the SI, References). Part of the 
database used in this paper can also be directly downloaded from [26] . 

Average values and standard deviations do not take extreme values into account, in order to remove 
some electoral errors, etc. Electoral values that deviate by more than 5 sigma (or 3 sigma for the polling stations of a 
town, as in the SI Section E) are not taken into account. 
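
The exclusion rule above can be sketched in Python as a two-pass computation. This is only one possible reading: it assumes a single trimming pass, a population (not sample) standard deviation, and deviations measured from the first-pass mean; the function name is hypothetical.

```python
import statistics

def trimmed_mean_std(values, n_sigma=5.0):
    """Mean and standard deviation after discarding values beyond n_sigma.

    First pass: mean and standard deviation over all values.
    Second pass: the same statistics over values within n_sigma of the first mean.
    Returns (mean, std, number_of_discarded_values).
    """
    mean = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    kept = [v for v in values if abs(v - mean) <= n_sigma * sigma]
    return statistics.fmean(kept), statistics.pstdev(kept), len(values) - len(kept)

# Hypothetical sample: 100 regular values and one aberrant entry.
sample = [1.0] * 100 + [100.0]
print(trimmed_mean_std(sample))
```

For the polling stations of a town, the same function would be called with `n_sigma=3.0`.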

^For instance, consider 100 municipalities of size ≈ N (as in Fig. 4), each with a civic involvement entropy S_i (i = 1, 2, ..., 100). 
First, ⟨S⟩ and σ are the average value and the standard deviation of S over these 100 municipalities. Next, the final average 
value of S and the final standard deviation over this sample of 100 municipalities are evaluated only for municipalities, i, 
Curves resulting from the reshuffling procedure give the average values over 1000 realizations, and standard 
deviations are plotted as error bars. 
Between order and disorder: 
a 'weak law' on recent electoral behavior among urban voters? 
Supporting Information 
A. Data 

• Elections studied at municipality scale 

Table S1 gives more details about the 76 elections studied in this paper at the municipality scale. There 
are: 13 elections from Austria [1] (≈ 2400 municipalities); 5 from Canada [2] (≈ 7700 municipalities); 
1 from the Czech Republic [3] (≈ 6400 municipalities); 20 from Metropolitan France [4] (≈ 36000 municipalities); 
7 from Germany [5] (≈ 12000 municipalities); 4 from Italy [6] (≈ 8100 municipalities); 4 from 
Mexico [7] (≈ 2400 municipalities); 11 from Poland [8] (≈ 2500 municipalities); 4 from Romania [9] 
(≈ 3200 municipalities); 4 from Spain [10] (≈ 8100 municipalities); and 3 from Switzerland [11] 
(≈ 2700 municipalities). 

Table S1 also gives basic statistics over the ≈ 100 (≈ 200 for France) most populated municipalities. 



• Time evolution at the national or provincial scale 

The study of the time evolution of S is done for the same countries as in Tab. S1 and for all national elections 
for which we have enough data. For Austria [12], the study considers data since 1945, even though compulsory 
voting was abolished in the whole country only in 1992 for National Council elections (D), and after 2004 
for Presidential elections (P) (although some provinces had already done so in 1982); for Canada [13], since 1945; 
for the Czech Republic [14], since 1990; for France [15], since 1945; for Germany [16], since 1949; for 
Italy [17], since 1945, even though voting was compulsory until 1993; for Mexico [18], since 1991; for 



^Corrections due to Wahlkarten or postal votes are taken into account from the national level, i.e. in this paper, each 
municipality receives from voting cards a number of votes and valid votes proportional to its population, with 
the same ratio for every municipality. 

^Chamber of Deputies (D) elections refer to the German Bundestag elections. Land Parliament elections held in or 
before 2004 (or 2010) in each Land are written here as '2004 Ld' (or '2010 Ld'). Postal votes (Briefwahlen) are usually 
taken into account at the Landkreis scale (they are distributed among municipalities according to their populations), when it is possible 
to do so. Nevertheless, these corrections make a very small difference in Fig. 4, especially for high population-size bins. 

^The 2006 Senador election (not studied here) gives statistics of S very close to those of the (P) and (D) elections that 
occurred at the same time. 

^The Chamber of Deputies (D) election is the Sejm Chamber election. 

^The referendum studied here is about the reduction of the number of parliamentarians to 300, 
and not about the adoption of a unicameral Parliament, held at the same time. The latter is not known at the polling-station 
level. 

^Some Romanian electors, not registered in the lista electorala permanenta, are able to vote. For this country, we continue 
to write N for the number of registered voters, N_v for the registered electors who take part in the election, and N_bn for the number 
of null and blank votes cast by the registered voters (even if the latter is not known). Romanian electoral 
data give, for each municipality, N, N_v, N_v(tot) (the total number of votes), and N_bn(tot), the total number of null 
and blank votes. Assuming that registered electors and non-registered electors vote null and blank in the same way (i.e. 

^The referendums or votations 'R(a)' and 'R(b)' occurred on 11 March and 17 July, respectively. The Legislative 
(D) election refers to the Conseil National election. 

^The 1990 and 1992 Deputies (D) elections refer only to the election of the Parliamentary Chamber of People. The elections of the 
Parliamentary Chamber of Nations and of the Parliamentary National Council, which occurred on the same day as the previous 
ones, gave approximately the same S value. 

^All French electoral data are from metropolitan France. Some referendums are not known at the département scale. In 
these cases, S is evaluated at the national scale. 

^We consider only the first question asked to electors in referendums. 
Election      S            pa      pc      pbn (pb)

Fr 1992 R     1.02±0.04    0.32    0.66    0.018
Fr 1993 D     1.09±0.04    0.34    0.63    0.028
Fr 1994 E     1.12±0.03    0.48    0.50    0.020
Fr 1995 P1    0.91±0.04    0.24    0.74    0.018
Fr 1995 P2    1.01±0.07    0.23    0.73    0.044
Fr 1997 D     1.08±0.03    0.36    0.62    0.024
Fr 1998 rg    1.11±0.03    0.46    0.52    0.019
Fr 1999 E     1.11±0.03    0.54    0.44    0.020
Fr 2000 R     1.02±0.07    0.71    0.25    0.036
Fr 2002 P1    1.01±0.04    0.31    0.67    0.019
Fr 2002 P2    0.95±0.07    0.21    0.75    0.035
Fr 2002 D     1.02±0.04    0.37    0.62    0.010
Fr 2004 rg    1.10±0.04    0.41    0.57    0.021
Fr 2004 E     1.04±0.03    0.57    0.42    0.010
Fr 2005 R     1.00±0.05    0.32    0.66    0.014
Fr 2007 P1    0.72±0.08    0.17    0.82    0.010
Fr 2007 P2    0.84±0.06    0.17    0.80    0.032
Fr 2007 D     1.04±0.03    0.42    0.57    0.009
Fr 2009 E     1.03±0.05    0.60    0.39    0.012
Fr 2010 rg    1.06±0.03    0.57    0.42    0.012

At 1994 D     0.81±0.11    0.20    0.78    0.016
At 1995 D     0.73±0.10    0.15    0.83    0.018
At 1996 E     1.04±0.04    0.33    0.65    0.021
At 1998 P     1.00±0.10    0.27    0.70    0.032
At 1999 E     1.06±0.05    0.52    0.46    0.013
At 1999 D     0.82±0.09    0.22    0.77    0.011
At 2002 D     0.73±0.10    0.17    0.81    0.011
At 2004 P     1.04±0.09    0.31    0.66    0.028
At 2004 E     1.03±0.05    0.59    0.40    0.010
At 2006 D     0.87±0.09    0.24    0.74    0.012
At 2008 D     0.88±0.08    0.24    0.75    0.014
At 2009 E     1.04±0.04    0.55    0.44    0.009
At 2010 P     1.16±0.06    0.48    0.49    0.034

Pl 2000 P1    0.98±0.03    0.36    0.63    0.006
Pl 2001 D     1.09±0.02    0.52    0.46    0.015
Pl 2003 R     0.98±0.02    0.37    0.62    0.004
Pl 2004 E     0.79±0.07    0.78    0.22    0.005
Pl 2005 D     1.06±0.03    0.58    0.41    0.013
Pl 2005 P1    1.02±0.01    0.49    0.51    0.003
Pl 2005 P2    1.05±0.01    0.47    0.53    0.006
Pl 2007 D     1.05±0.03    0.42    0.57    0.010
Pl 2009 E     0.87±0.06    0.73    0.27    0.004
Pl 2010 P1    1.01±0.02    0.43    0.57    0.004
Pl 2010 P2    1.03±0.02    0.43    0.56    0.007

Ge 2002 D     0.83±0.07    0.22    0.77    0.009
Ge 2004 Ld    1.02±0.04    0.41    0.58    0.007
Ge 2004 E     1.02±0.05    0.59    0.40    0.009
Ge 2005 D     0.87±0.06    0.24    0.75    0.011
Ge 2009 E     1.00±0.05    0.60    0.40    0.006
Ge 2009 D     0.95±0.05    0.30    0.69    0.009
Ge 2010 Ld    1.04±0.03    0.43    0.56    0.009

Ca 1997 D     1.00±0.04    0.37    0.62    0.009
Ca 2000 D     1.03±0.03    0.44    0.56    0.006
Ca 2004 D     1.02±0.02    0.46    0.54    0.004
Ca 2006 D     1.01±0.02    0.44    0.56    0.003
Ca 2008 D     1.02±0.02    0.49    0.51    0.003

It 2004 E     1.11±0.12    0.29    0.66    0.053 (0.023)
It 2006 D     0.78±0.13    0.17    0.81    0.020 (0.007)
It 2008 D     0.89±0.12    0.20    0.77    0.027 (0.008)
It 2009 E     1.08±0.10    0.36    0.61    0.034 (0.013)
Mx 2003 D     1.04±0.05    0.59    0.40    0.013
Mx 2006 D     1.04±0.04    0.40    0.58    0.012
Mx 2006 P     1.03±0.04    0.40    0.59    0.010
Mx 2009 D     1.11±0.06    0.56    0.41    0.027
Ro 2009 E     0.73±0.09    0.81    0.18    0.008
Ro 2009 R     1.09±0.02    0.55    0.44    0.017
Ro 2009 P1    1.05±0.02    0.52    0.48    0.008
Ro 2009 P2    1.04±0.02    0.50    0.50    0.006
Sp 2004 D     0.92±0.07    0.24    0.74    0.020 (0.014)
Sp 2004 E     1.01±0.06    0.57    0.42    0.006 (0.003)


Sp 2008 D 


0.91±0.08 


0.26 


0.73 


0.013(0.009) 


Sp 2009 E 


l.O3±0.04 


0.56 


0.43 


0.009(0.006) 


CH 2007 R(a) 


1.04±0.04 


0.53 


0.46 


0.008(0.004) 


CH 2007 R(b) 


0.99±0.06 


0.62 


0.37 


0.007(0.004) 


CH 2007 D 


1.04±0.05 


0.53 


0.47 


0.009(0.002) 












Cz 2003 R 


l.OTitO.Ol 


0.47 


0.52 


0.012 





Table S1. Elections studied in this paper at the municipality scale. An election is identified
(Id) by its country, its year and its nature. D: Chamber of Deputies election; E: European
Parliament election; P: presidential election (held, according to the constitution of the country, in only one
round); P1 and P2: first and second round of a presidential election; R: referendum; Ld: German
Länder elections; rg: French Régionales elections. For each country, elections are given in
chronological order (but the 2006 Mexican Presidential (P) and Deputies (D) elections occurred on the
same day, as did the 2009 Romanian Presidential (P1) and Referendum (R) elections). Even if an
election needs two rounds, only the first one is considered (e.g. the French Deputies (D) and Régionales
(rg) elections) unless the contrary is indicated (e.g. P1 and P2). Mean values of S, pa, pc, pbn (and pb,
if Blank Votes are distinguished from Null Votes), and the standard deviation for S only, are given
over the bin of the ≈ 100 (or ≈ 200 for France only) most populated municipalities. In bold text,
S ∈ [0.98; 1.08].
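
As an illustration of how the S column relates to the three ratios, here is a minimal Python sketch. It assumes that the involvement entropy is the Shannon entropy of (pa, pc, pbn) taken with base-2 logarithms. The definition of S is given in the main text, not in this section, so the base is an assumption, chosen because it reproduces the order of magnitude of the tabulated values. The function name is hypothetical.

```python
import math

def involvement_entropy(p_a, p_c, p_bn):
    """Shannon entropy (base 2, an assumption) of the three ratios.

    Ratios equal to zero contribute nothing to the sum.
    """
    return -sum(p * math.log2(p) for p in (p_a, p_c, p_bn) if p > 0)

# Mean ratios of the "Ca 2008 D" row of Table S1.
# They sum to 1.003 only because the table rounds each ratio.
s_example = involvement_entropy(0.49, 0.51, 0.003)
print(round(s_example, 2))
```

The table reports the mean of S over municipalities, whereas this sketch evaluates S at the mean ratios, so the two numbers need not coincide exactly.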
Country   Kind of elections            Scale of aggregate data

At        D, E, P, R                   National
Ca        D                            Province (5-13)
CH        D, R                         Canton (25-26)
Cz        D, E, R, rg, S1, S2          National
Fr        Cant, D, E, P1, P2, R, rg    département (90-96)
Ge        D, E                         Land (9-16)
It        D, E, R, S                   National
Mx        D, P                         National
Pl        D, E, P1, P2                 National
Ro        D, E, P1, P2, R              National
Sp        D, E, R                      Comunidad autónoma (17-19)


Table S2. Elections studied in this paper for their global S as a function of time. Notation
is the same as in Tab. S1. For the Czech Republic, "rg" means election to the regional councils, and "S1" and
"S2" are respectively the first and second round of the Senate elections; for France, "Cant" refers to the
Cantonales elections, and some referendums are only known at the national scale; for Italy, "S" means
Senate elections, which occur at the same time as Deputies elections (D) but with older registered voters.
In parentheses, the total number of different provinces (or cantons, etc.) in the whole country, which can
change in time.

Poland US], since 1990 E!l; for Romania [ID], since 1990; for Spain gT], since 1976; for Switzerland [H], 
since 1884 for referendums (R) and since 1919 for legislative elections (D). If an election needs two rounds, 
the first one is considered, unless the contrary is indicated. The Mexican, Polish and Romanian Senate 
elections are not shown here because they occur at the same time as Chamber of Deputies elections and 
have very similar S results. 

Table S2 summarizes the nature of the elections studied in this paper, and also the scale of aggregate data
per country. Note that the last election analyzed in this paper is the referendum held in Italy in
June 2011.

• Elections studied at polling station scale

The polling station analysis is restricted to polling stations which belong to one of the 100 most populated
municipalities (for the considered election). 31 elections at the polling station scale are studied in this
paper: 5 for Canada (each Canadian election of Tab. S1), with around 25000 polling stations; 13 for
France (French elections of Tab. S1 since 1999), with around 7000 polling stations; 4 for Mexico (each
Mexican election of Tab. S1), with around 55000 polling stations or ballot boxes; 5 for Poland (Polish
elections of Tab. S1 from 2003 up to 2005), with around 8000 polling stations; and 4 for Romania (each
Romanian election of Tab. S1), with around 6000 polling stations. See Tab. S3 for some basic statistics
over polling stations of the 100 most populated municipalities.



We have no data from the 1989 Chamber of Deputies (Sejm) election nor from the two referendums in 1996.

Official results (which take into account registered voters) of the Canadian Chamber of Deputies election, held in May
2011, were not published at the time we first submitted this paper. In Fig. S1, the involvement entropy over all provinces
would be 1.00 ± 0.02, and respectively 0.99 and 1.02 for Ontario and Quebec.
Id             S              τ3

Fr 1999 E      1.09 ± 0.05    -3.7 ± 0.6
Fr 2000 R      1.00 ± 0.11    -4.2 ± 0.6
Fr 2002 P1     1.00 ± 0.06    -2.1 ± 0.6
Fr 2002 P2     0.93 ± 0.10    -0.7 ± 0.6
Fr 2002 D      1.01 ± 0.06    -3.2 ± 0.7
Fr 2004 rg     1.09 ± 0.05    -2.8 ± 0.6
Fr 2004 E      1.03 ± 0.05    -4.6 ± 0.7
Fr 2005 R      0.99 ± 0.07    -2.5 ± 0.7
Fr 2007 P1     0.71 ± 0.11    -1.3 ± 0.7
Fr 2007 P2     0.83 ± 0.09    -0.1 ± 0.7
Fr 2007 D      1.03 ± 0.04    -3.8 ± 0.7
Fr 2009 E      1.02 ± 0.07    -4.6 ± 0.7
Fr 2010 rg     1.04 ± 0.05    -4.4 ± 0.7

Ca 1997 D      0.98 ± 0.08    -3.3 ± 1.3
Ca 2000 D      1.00 ± 0.06    -4.1 ± 1.1
Ca 2004 D      1.00 ± 0.05    -4.4 ± 0.9
Ca 2006 D      0.99 ± 0.05    -4.4 ± 0.8
Ca 2008 D      1.00 ± 0.05    -4.6 ± 0.9

Pl 2003 R      0.95 ± 0.10    -4.0 ± 0.9
Pl 2004 E      0.83 ± 0.13    -6.3 ± 0.8
Pl 2005 D      1.05 ± 0.08    -4.1 ± 0.9
Pl 2005 P1     1.00 ± 0.07    -4.9 ± 0.9
Pl 2005 P2     1.01 ± 0.05    -4.1 ± 0.9

Mx 2003 D      1.03 ± 0.07    -4.3 ± 0.9
Mx 2006 D      1.02 ± 0.07    -3.2 ± 0.8
Mx 2006 P      1.01 ± 0.07    -3.4 ± 0.8
Mx 2009 D      1.11 ± 0.10    -3.5 ± 0.9

Ro 2009 E      0.70 ± 0.13    -6.6 ± 0.9
Ro 2009 R      1.08 ± 0.05    -3.8 ± 0.7
Ro 2009 P1     1.04 ± 0.03    -4.5 ± 0.7
Ro 2009 P2     1.04 ± 0.03    -4.4 ± 0.7

Table S3. Elections studied at the polling station level. An election is identified (Id) by its
country, its year and its nature. Mean value and standard deviation of S and of τ3 (see the SI
Section D) over ballot boxes in the 100 most populated municipalities.
B. More details on data analysis

Figure S1. Time evolution of the mean involvement entropy at large scale (national,
provincial, etc.). See Section A and Tab. S2 for more details and also for the end of compulsory voting
in Italy (cf. vertical dashed line) and in Austria. Whenever the scale of aggregate data is lower than the
national one, standard deviations (weighted by the number of registered voters) are also shown as error
bars. Italian and Spanish graph insets show a variant of S where Blank Votes are categorized as Valid
Votes (see Section F for more discussion). See text for more explanation about some French curves.



Fig. S1 gathers all the available data (see the SI, Section A for more details) at a large aggregate
scale (country, province, département, etc.). When the scale of aggregate data is lower than the national
one, each point corresponds to a weighted (by population size) mean value of the involvement entropies at
the lower scale (province, département, etc.), and the standard deviation is also given as an error bar. The cases
where Blank Votes are distinguished from Null Votes (i.e. in Italy, Spain and Switzerland) call for a
specific discussion (see the SI, Section F).

Let us comment on Fig. S1 for the case of the Chamber of Deputies elections in France, at the large
scale called département (96 of them in metropolitan France). One sees an involvement
entropy frequently equal to ≈ 0.8 until 1981, which then increases and becomes greater than 1 until 2000,
and then decreases a little and stabilizes at S ≈ 1 after 2000. So, the civic involvement of the electorate (at the
département scale) is relatively ordered until 1981 and gets more and more disordered until 2000. After
2000, S seems to stabilize at a common value S ≈ 1, which is also reached for the European Parliament
elections and for local elections at different scales, such as the Régionales (≈ states) and the Cantonales
(≈ counties) elections.
Figure S2. Moving average, as a function of time, per country, of pa and pbn at national
scale for Chamber of Deputies elections. The average is made over 4 elections. Left: ratio of
registered voters who do not take part in the election (pa); Right: Blank and Null ratio (pbn).

Figure S3. Histograms of S for the ≈ 200 (left) and 50 (right) most populated
municipalities, similarly to Fig. 7-d (with the 100 most populated municipalities for the latter one).

Figure S4. Evolution in time of scatter plots of (pa, pbn) at national level for 321 elections.
Elections are divided into two groups in the same manner as in Fig. 5. Curves give the sets of
points (pa, pbn) such that S(pa, pbn) is equal to one of the two endpoints of the minimal interval of S
which contains 50% of events. Note that if S were equal to the average value (weighted by the population size)
at a lower aggregate scale (such as provinces, départements, etc.) as in Fig. 5, the peak of S near S ≈ 1 would
be narrower and more centered on S = 1.

Figure S5. Scatter plots of (pa, pbn) of French municipalities according to their relative
population size, over elections since 2000 (similarly to Fig. 7-b, c, d). The sets of points (pa, pbn)
such that S(pa, pbn) is equal to one of the two endpoints of the minimal interval of S which contains
50% of events (as in Fig. 8 for the most populated municipalities) are also plotted.
C. Finite size effects

We show in this section that finite size effects, with respect to the municipality size N, on the involvement entropy
S are relatively small for the most populated municipalities. Biases due to finite size effects have two
possible origins: (1) the level of aggregation of the data, with N ranging from about a hundred to a million, influences
the measures of S, and (2) a statistical effect due to large numbers. Without loss of generality, we examine these
two biases for French electoral data - with 20 elections at the municipality scale and 13 at the polling
station level, cf. the SI Section A. Lastly, we show that the distribution of the involvement entropy, which
is sharply peaked near S ≈ 1 for the most populated towns, is not due to considering a large number of
citizens per town.

(1) Scale at which data are aggregated 

French municipality sizes range from around 10 to around 100,000. In order to investigate how
the scale of aggregate data modifies the measurement of the involvement entropy S, for each municipality we
compare the results at the municipality scale with those obtained at the polling station scale. Registered
voters per polling station do not exceed around one thousand in France. For a given municipality, we compare its
involvement entropy, S, measured at the municipality level, to the mean value, S_ps, of the involvement
entropy over all the polling stations in the considered municipality. Concavity of the logarithmic
function implies that the latter is at most equal to the former. For each of the 200 most populated French
municipalities, and for each of the 13 French elections known at the polling station scale (see the SI
Section A), the gap between S and S_ps is less than about 2% (except for a very few, typical recording
errors of electoral data). Moreover, averaging S and S_ps over samples of ≈ 200 municipalities of similar
sizes N gives a difference of less than 1% for N > 1000.

In short, for large population municipalities, the bias introduced by the scale at which data are
aggregated is weak and does not affect the main conclusions of the paper.
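
The comparison between S and S_ps can be sketched as follows. This is only an illustration under stated assumptions: the entropy is taken in base 2, the polling station counts are hypothetical, and S_ps is computed as a mean weighted by registered voters, a choice under which the inequality S_ps ≤ S holds exactly.

```python
import math

def involvement_entropy(p_a, p_c, p_bn):
    """Base-2 Shannon entropy of the three ratios (the base is an assumption)."""
    return -sum(p * math.log2(p) for p in (p_a, p_c, p_bn) if p > 0)

def ratios(n, n_a, n_bn):
    """Return (p_a, p_c, p_bn) from registered voters, abstentionists, blank/null votes."""
    return n_a / n, (n - n_a - n_bn) / n, n_bn / n

# Hypothetical polling stations of one town: (registered, abstentionists, blank and null)
stations = [(900, 270, 18), (1000, 450, 10), (800, 400, 24)]

n_tot = sum(n for n, _, _ in stations)
n_a_tot = sum(a for _, a, _ in stations)
n_bn_tot = sum(b for _, _, b in stations)

# Entropy measured on the aggregated (municipality) counts
s_town = involvement_entropy(*ratios(n_tot, n_a_tot, n_bn_tot))

# Mean of the polling station entropies, weighted by registered voters
s_ps = sum(n * involvement_entropy(*ratios(n, a, b)) for n, a, b in stations) / n_tot

relative_gap = (s_town - s_ps) / s_town
print(s_town, s_ps, relative_gap)
```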

(2) Statistical effects due to large numbers 

Let us see if statistical fluctuations due to finite size effects considerably modify the expected values
of the involvement entropy. Indeed, for independent events, according to the central limit theorem (under
conditions broadly applicable), fluctuations are on the order of $1/\sqrt{N}$. This is expected to be the case
for the ratios pa and pbn, which should then lead to a bias in the entropy value. We want to estimate
this bias and see if it is negligible (say less than 1%). To do so, we make a simulation with artificial
data. To calibrate these data, we make use of the sample of the most populated municipalities. We
measure the average values $\bar{p}_a$ and $\bar{p}_{bn}$ of pa and pbn over all municipalities in this sample of the largest
municipality size, and the corresponding standard deviations $\sigma_a$ and $\sigma_{bn}$. The surrogate data consist of
the same number of "municipalities", each one characterized by the same population size as in the empirical
data. For these surrogate municipalities, we draw the numbers of Abstentionists and of Null-Blank votes
from binomial distributions, parametrized by the empirical average values and standard deviations of pa
and pbn, as follows.

Consider a surrogate municipality with N registered voters. Its numbers of Abstentionists, $N_a$, and of Null-
Blank votes, $N_{bn}$, are drawn from binomial distributions such that:

$$N_a \sim B(N;\ \bar{p}_a + \eta_a),$$
$$N_{bn} \sim B(N;\ \bar{p}_{bn} + \eta_{bn}), \tag{S1}$$

where $\eta_a$ and $\eta_{bn}$ are independent random Gaussian noises of zero mean and of standard deviations $\sigma_a$ and
$\sigma_{bn}$, respectively. Note that here, for each citizen in a surrogate municipality, the probabilities to not vote
and to cast a null-blank vote are mutually independent.

Now we can compare the average value $\bar{S}(N)$ of the municipal involvement entropy in a sample of
surrogate municipalities of size ≈ N with $\bar{S}(N_{max})$ in the sample of the most populated municipalities. We find
that the difference is less than 1% when N > 2000. In other words, for municipality sizes greater than
around 2000, statistical fluctuations due to finite size effects are negligible as far as the present
study is concerned.
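
The surrogate procedure of Eq. (S1) can be sketched with the Python standard library alone. Everything numerical below is hypothetical: the calibration values (mean ratios and standard deviations), the number of surrogate municipalities and the seed are illustration choices, not the values used for the paper. The entropy is again taken in base 2 (an assumption), and the clipping steps that keep probabilities and counts valid are additions of this sketch.

```python
import math
import random

def involvement_entropy(p_a, p_c, p_bn):
    """Base-2 Shannon entropy of the three ratios (the base is an assumption)."""
    return -sum(p * math.log2(p) for p in (p_a, p_c, p_bn) if p > 0)

def binomial(n, p, rng):
    """Draw from B(n; p) as a sum of n Bernoulli trials (adequate for moderate n)."""
    return sum(rng.random() < p for _ in range(n))

def surrogate_entropy(n, mean_a, mean_bn, sd_a, sd_bn, rng):
    """Entropy of one surrogate municipality with n registered voters, following Eq. (S1)."""
    # Gaussian noise of zero mean added to the calibrated mean ratios, clipped to [0, 1]
    p_a = min(max(mean_a + rng.gauss(0.0, sd_a), 0.0), 1.0)
    p_bn = min(max(mean_bn + rng.gauss(0.0, sd_bn), 0.0), 1.0)
    n_a = binomial(n, p_a, rng)
    # The two draws are independent; cap n_bn so that the counts stay consistent
    n_bn = min(binomial(n, p_bn, rng), n - n_a)
    n_c = n - n_a - n_bn
    return involvement_entropy(n_a / n, n_c / n, n_bn / n)

def mean_surrogate_entropy(n, n_towns=200, seed=0,
                           mean_a=0.40, mean_bn=0.02, sd_a=0.05, sd_bn=0.005):
    """Average entropy over n_towns surrogate municipalities of size n (hypothetical calibration)."""
    rng = random.Random(seed)
    values = [surrogate_entropy(n, mean_a, mean_bn, sd_a, sd_bn, rng) for _ in range(n_towns)]
    return sum(values) / n_towns

s_small = mean_surrogate_entropy(100)
s_large = mean_surrogate_entropy(2000)
print(s_small, s_large, abs(s_large - s_small) / s_large)
```

Comparing the averages for several sizes N against the largest one gives the relative bias discussed in the text.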

To conclude, we have seen that, for French electoral data, finite size effects do not significantly affect
the municipal involvement entropy (i.e. by less than a 2% deviation) for N greater than 2000. Note
that 2000 is much less than the typical size of the most populated municipalities, for which
the common value S ≈ 1 is frequently found. Lastly, the same analysis done for the other countries for
which electoral data are also available at the polling station scale (see the SI Section A) gives the same
results (see e.g. the mean values of S over the 100 most populated municipalities, at the municipality scale
in Tab. S1, compared to those at the ballot box scale in Tab. S3).
Figure S6. Histograms of S of the 100 most populated towns compared with 100 artificial
towns (see text), in France, over elections since 2000.

Now, let us show that the shape of the distribution of S over the 100 most populated towns (which is
sharply peaked near S ≈ 1, apart from Austria) does not result from aggregating a large number of
citizen choices. In other words, the shape of the distribution of the involvement entropy for the 100 most
populated towns (cf. Fig. 7-d) cannot be explained by a statistical bias due to a large number effect.

In order to see this point, 100 artificial towns are created - in France, without loss of generality.
Each artificial town results from the aggregation, over 300 real small municipalities, of the real numbers of
registered voters (N), abstentionists (Na), blank and null votes (Nbn) and votes according to the list
of choices (Nc). In other words, an artificial town comes from the aggregation of the real choices of citizens
who live in small municipalities. Each municipality is taken into account only once. These 100 French
artificial towns have aggregated numbers of registered voters (N) from 7000 to 330000, equal to 34000
on average. Fig. S6 allows one to compare the real distribution of S of the most populated French towns
over elections since 2000 with the one which results from these 100 artificial towns. These two histograms
are clearly different.

To conclude, the shape of the distribution of the involvement entropy of the most populated towns (cf.
Fig. 7-d) is not due to a bias rooted in aggregating a large number of citizen choices. The shape itself
depends on the real choices of the citizens who live in these towns.
D. Logarithmic three choices value, τ3, of polling stations

The aim of this section is to introduce a variable, called the logarithmic three choices value, τ3, which
takes into account the set of three ratios {pa, pc, pbn}. First, the distribution of τ3 over polling stations
in the 100 most populated municipalities appears stable over time, and also similar between different
countries. Secondly, we give arguments against two successive binary choice decisions leading to blank
and null votes or votes according to the list of choices (i.e. to vote or not, then to cast a valid vote). This
leads us to consider the three quantities pa, pc and pbn together in order to deal with the electoral
involvement.

• Threshold decision rule of a unique binary choice.

Consider an agent i. Its decision ($n_i$ = 0 or 1) is written as $n_i = \Theta(h_i + H)$, with $\Theta(x > 0) = 1$ and $\Theta(x < 0) = 0$.
$h_i$ is a kind of idiosyncratic term of the agent i. Idiosyncrasies are considered as uncorrelated between
agents, and are described as independent random variables. H is a global field applied to all agents, like a
'cultural field' [23], etc. Note that there is no interaction between agents in this rough decision rule.

According to this decision rule, an aggregate value such as the ratio p of agents taking the decision 1
is written as $p = \mathcal{P}_>(-H)$, where $\mathcal{P}_>(-H)$ denotes the cumulative distribution of idiosyncrasies h, i.e.
$\mathcal{P}_>(-H) = \int_{-H}^{\infty} P(h)\,dh$ (finite size effects are neglected here). If idiosyncrasies are assumed to be distributed
according to a logistic distribution, P, of zero mean and of unit width, it follows that

$$p = \frac{1}{1 + e^{-H}}, \quad\text{thus}\quad H = \ln\left(\frac{p}{1-p}\right). \tag{S2}$$

(Applied to voter turnout $p_v = 1 - p_a$, the distribution of logarithmic turnout rates $\tau = \ln\left(\frac{p_v}{1-p_v}\right)$ across
French municipalities is surprisingly stable over time [23], which allows one to make predictions, confirmed
by real measures [MlllS].)

Figure S8. Logarithmic three choices value, τ3, with respect to involvement entropy S.
Measures come from mean values of τ3 and S over polling stations in each of the 100 most populated
towns. Curves are smoothed. Note that there is not a one-to-one relation between τ3 and S.

• 2 mutually exclusive threshold decisions

Now, let us assume that the two decisions, (1) to vote according to the list of choices and (2) to cast a
blank or null vote, are mutually exclusive. In other words, abstentionists are considered as a reservoir
from which agents decide whether or not to make choice (1) or choice (2); moreover, if they decide to make
choice (1) (or conversely (2)), they no longer decide whether or not to make choice (2) (or conversely
(1)). Thus, let $H_c$ be the global field in favor of choice (1) (to vote according to the list of choices),
and $p_c^0$ the global ratio if choice (1) were unique, i.e. without any existence of choice (2) (see Fig. S7-a).
Conversely, $H_{bn}$ and $p_{bn}^0$ refer to choice (2) (to cast a blank or null vote) if it were the unique choice. From
Eq. (S2), and still assuming a logistic distribution of idiosyncrasies, it follows that

$$p_c^0 = \mathcal{P}_>(-H_c), \quad\text{thus}\quad H_c = \ln\left(\frac{p_c^0}{1-p_c^0}\right),$$
$$p_{bn}^0 = \mathcal{P}_>(-H_{bn}), \quad\text{thus}\quad H_{bn} = \ln\left(\frac{p_{bn}^0}{1-p_{bn}^0}\right). \tag{S3}$$

In agreement with what precedes, choice (1) (or conversely (2)) is made among agents who have not
decided to make the other choice (2) (or conversely (1)). In this way, the observed ratios $p_c$ and $p_{bn}$, which
exist when the two choices coexist, are related to $p_c^0$ and $p_{bn}^0$ (which would exist if each choice
were the only one) by

$$p_c^0 = \frac{p_c}{1-p_{bn}}, \qquad p_{bn}^0 = \frac{p_{bn}}{1-p_c}. \tag{S4}$$

Writing $\tau_3 = H_c + H_{bn}$, the sum of civic global fields applied to registered voters in this 3 choices
process (leading to $p_c$, $p_{bn}$ and $p_a$), Eqs. (S3, S4) yield

$$\tau_3 = \ln\left(\frac{p_c\, p_{bn}}{p_a^2}\right). \tag{S5}$$
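
The step from Eqs. (S3, S4) to Eq. (S5) can be checked numerically. The sketch below is an illustration with a hypothetical function name. It computes τ3 as the sum of the two fields, using $1 - p_c^0 = p_a/(1-p_{bn})$ and $1 - p_{bn}^0 = p_a/(1-p_c)$, so that $H_c = \ln(p_c/p_a)$ and $H_{bn} = \ln(p_{bn}/p_a)$. It assumes the three ratios sum to one.

```python
import math

def tau3(p_a, p_c, p_bn):
    """Logarithmic three choices value, as the sum of the two fields H_c and H_bn.

    Undefined (ValueError) when one of the three ratios is zero.
    """
    if min(p_a, p_c, p_bn) <= 0:
        raise ValueError("tau3 is undefined when one of the three ratios is zero")
    h_c = math.log(p_c / p_a)    # H_c  = ln(p_c^0 / (1 - p_c^0))  with p_c^0  = p_c / (1 - p_bn)
    h_bn = math.log(p_bn / p_a)  # H_bn = ln(p_bn^0 / (1 - p_bn^0)) with p_bn^0 = p_bn / (1 - p_c)
    return h_c + h_bn

# Mean ratios of the "Fr 2007 P2" row of Table S1 (municipality scale)
tau_example = tau3(0.17, 0.80, 0.032)
print(round(tau_example, 2))
```

With these municipality-scale mean ratios the result is close to the polling station mean reported for the same election in Table S3. The agreement is only approximate, since the mean of τ3 over polling stations is not τ3 evaluated at the mean ratios.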

• τ3 of polling stations in most populated towns

First, there is not a one-to-one relation between the logarithmic three choices value, τ3, and the
involvement entropy S. Indeed, it is enough to note that the three ratios {pa, pc, pbn} play a symmetric role
for S, and not for τ3. Fig. S8 plots τ3 with respect to S for their average values over polling stations
in each of the 100 most populated towns (see also Tab. S3 for basic statistics of S and τ3 over polling
stations in the 100 most populated municipalities).

Fig. S9 shows the pdf of the logarithmic three choices value τ3 over the different polling stations of the
100 most populated towns in each country (apart from the Canadian ones, because more than a third of their polling
stations have pbn = 0, which leaves their logarithmic three choices values undefined), i.e. the
probability P(τ3)dτ3 that a given polling station, inside the 100 most populated towns, has τ3 to within
dτ3. Although the average ⟨τ3⟩ over these polling stations varies quite substantially between elections
(see Fig. S8 and Tab. S3), the shape of the distribution of τ3 − ⟨τ3⟩ is quite constant for each country,
and particularly for France and Mexico.

When one of the three ratios {pa, pc, pbn} is equal to zero, τ3 is undefined.
Figure S9. Distribution over polling stations of the 100 most populated towns of
P(τ3 − ⟨τ3⟩) for each election, where τ3 is the logarithmic three choices value and ⟨τ3⟩ its average
value over all concerned polling stations.

Figure S10. Distribution of normalized τ3 over polling stations of the 100 most populated
towns for 26 elections. The dotted line and the dashed line show respectively the Pl-2003-R and Pl-2005-D
elections. A normalized Gaussian is also plotted.



Moreover, the distributions of normalized τ3 (i.e. $\tilde{\tau}_3 = (\tau_3 - \langle\tau_3\rangle)/\sigma$, where ⟨τ3⟩ and σ are respectively the mean
value and the standard deviation of τ3 over polling stations of the 100 most populated municipalities)
appear to be similar to one another, within the same country as well as across countries, for French,
Mexican, Romanian and half of the Polish elections (see Fig. S10). In fact, a Kolmogorov-Smirnov test where
one only allows for a relative shift of the normalized distributions $P(\tilde{\tau}_3)$, over polling stations of the 100
most populated towns, does not allow one to reject the hypothesis that the distribution $P(\tilde{\tau}_3)$ is indeed
the same for all elections (from France, Mexico, Romania and half of those in Poland).

Note that we have restricted this study to the polling stations of the 100 most populated towns
because the common value S ≈ 1 of civic involvement appears for municipalities with high populations.
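
The comparison of normalized distributions can be illustrated with a small standard-library sketch. It only computes the two-sample Kolmogorov-Smirnov distance after normalization. The significance threshold and the shift-only variant of the test mentioned above are not reproduced, and the function names and toy data are hypothetical.

```python
import math
import statistics

def normalize(values):
    """Return (v - mean) / standard deviation for each value (population standard deviation)."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

def ks_distance(xs, ys):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the empirical CDFs."""
    xs, ys = sorted(xs), sorted(ys)
    i = j = 0
    d = 0.0
    while i < len(xs) and j < len(ys):
        v = min(xs[i], ys[j])
        # Advance both samples past the current value, so that ties are handled together
        while i < len(xs) and xs[i] <= v:
            i += 1
        while j < len(ys) and ys[j] <= v:
            j += 1
        d = max(d, abs(i / len(xs) - j / len(ys)))
    return d

# Toy tau3 samples for two hypothetical elections: same shape, different mean and width
election_a = [0, 2]
election_b = [10, 14]
print(ks_distance(election_a, election_b))                        # raw samples differ
print(ks_distance(normalize(election_a), normalize(election_b)))  # normalized samples coincide
```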

• Two successive binary threshold decisions 

Now, let us show why it is preferable to consider two exclusive decisions (cf. Fig. S7-a) rather than two successive
decisions (cf. Fig. S7-b) related to the civic involvement of the electorate. In the latter case, the first
binary decision is to vote or not to vote, and the second binary decision is to cast a valid vote
(according to the list of choices) or an invalid vote (i.e. a Blank or Null vote), knowing that the
considered agent is a voter (i.e. the first decision is to vote). In other words, the decision to cast a vote
according to the list of choices (or conversely to cast a Blank or Null vote) results from two successive
binary decisions: first to vote, second to cast a valid vote (or conversely a Blank or Null vote)
knowing that the first decision is to vote.

Let $H_v$ be the global field related to the first decision, i.e. to vote (see Fig. S7-b). Thus the ratio of
voters, $p_v$, over registered voters is written as:

$$p_v = \mathcal{P}_>(-H_v), \quad\text{thus}\quad H_v = \ln\left(\frac{p_v}{1-p_v}\right). \tag{S6}$$

(Recall that $p_v = 1 - p_a = p_c + p_{bn}$.) Let $H_{c|v}$ be the global field related to the second binary decision,
i.e. knowing that the agent is a voter, to cast a vote according to the list of choices. So, the ratio of votes
according to the list of choices over voters is written as

$$\frac{p_c}{p_v} = \mathcal{P}_>(-H_{c|v}), \quad\text{thus}\quad H_{c|v} = \ln\left(\frac{p_c}{p_{bn}}\right). \tag{S7}$$

With the opposite convention to the previous one, the second decision to cast a Blank or Null vote is such
that $H_{bn|v} = -H_{c|v}$ (since $p_{bn}/p_v = \mathcal{P}_>(-H_{bn|v})$ and $H_{bn|v} = \ln\left(\frac{p_{bn}}{p_c}\right)$).

According to these two successive binary choices, the global field which leads a registered voter to cast
a Valid vote is $H'_c = H_v + H_{c|v} = \ln\left(\frac{p_v\, p_c}{p_a\, p_{bn}}\right)$; and to cast a Blank or Null vote is $H'_{bn} = H_v + H_{bn|v} =
\ln\left(\frac{p_v\, p_{bn}}{p_a\, p_c}\right)$. When the Blank or Null ratio is very small ($p_{bn} \ll 1$) and $p_c \sim p_a$, this leads to $H'_{bn} \simeq \tau_3$. So, the
statistics of $H'_{bn}$ are very near to those of τ3.

If this successive binary decisions point of view were roughly correct, $H'_c$ and $H'_{bn}$ should share their main
features. Nevertheless, this is strongly rejected by real data. The shape of the distribution of $H'_{bn} - \langle H'_{bn}\rangle$ is
not constant at all (not shown in this paper) for each country over various elections over polling stations
of the 100 most populated towns, as is also confirmed by the Kolmogorov-Smirnov distance between
two distributions. This allows us to prefer the hypothesis of two mutually exclusive binary decisions
(which leads to τ3, and its surprising stability over time and countries) to the hypothesis of
two successive binary decisions (which should lead to $H'_c$ or $H'_{bn}$, which should have the same features) regarding
civic involvement.
E. Looking for signs of tension, through polling stations analysis

Figure S11. Civic-involvement heterogeneity (at the polling station scale) in a town with
respect to the involvement entropy of the town. Curves are smoothed and concern the 100 most
populated municipalities. Benchmark (see text) curves are plotted in the insets. Heterogeneity
measures result from the standard deviation of the involvement entropy of polling stations (S11-a), the
Kullback-Leibler divergence between a polling station and the other polling stations of the town (S11-b),
and the standard deviation of the logarithmic three choices value of polling stations (S11-c). Fig. S11-d: same as
Fig. S11-a, but restricted to the 5 elections which deviate the most from S ≈ 1, where solid lines and
dashed lines plot respectively real data and benchmark curves.

This section seeks to detect some 'tension', in connection with the involvement entropy. We follow
the assumption that 'tension' has some effect on the heterogeneity of polling stations inside a town. In
other words, we try to detect some specific variation of polling station heterogeneity in a given town, in
connection with the involvement entropy of this town. The heterogeneity of polling stations inside a same town
is investigated in three different ways: (1) the standard deviation of involvement entropies over all polling
stations of the considered town; (2) the Kullback-Leibler divergence of one polling station compared to
the other polling stations of the town; (3) the standard deviation of the logarithmic three choices value (because
the shape of its distribution is stable, see the SI Section D) over all polling stations of the town.

The analysis uses polling stations inside the 100 most populated towns (see the SI Section A for more
details). Real results will be compared to a benchmark. The benchmark is based on the same
heterogeneity of the ratios pa (idem for pc and pbn) of polling stations for every town.

Consider a town and a polling station of this town, respectively called α and $\alpha_i$. The polling station $\alpha_i$ has
some measures, for instance its number of registered voters $N_{\alpha_i}$, and the set of 3 ratios $\{p_{a,\alpha_i}, p_{c,\alpha_i}, p_{bn,\alpha_i}\}$
that provides its involvement entropy $S_{\alpha_i}$ and its logarithmic three choices value $\tau_{3,\alpha_i}$. The average over
all the polling stations of the town (weighted by the number of registered voters, $N_{\alpha_i}$) gives the
corresponding value for the whole town α, e.g. the set of 3 ratios $\{p_{a,\alpha}, p_{c,\alpha}, p_{bn,\alpha}\}$, its involvement entropy
$S_\alpha$ and its logarithmic three choices value $\tau_{3,\alpha}$. The weighted (by the number of registered voters)
standard deviation over all the polling stations $\alpha_i$ of the town α is written as $\delta[..]_\alpha$, for instance
$\delta[p_a]_\alpha$, etc., $\delta[S]_\alpha$ and $\delta[\tau_3]_\alpha$. These quantify the heterogeneities within the town α.
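
A weighted mean and weighted standard deviation of this kind can be sketched as below. The function name and the polling station data are hypothetical, and the population form of the weighted variance (no small-sample correction) is an assumption, since the text does not specify it.

```python
import math

def weighted_mean_and_std(values, weights):
    """Mean and standard deviation of values, weighted by weights (population form)."""
    total = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / total
    variance = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return mean, math.sqrt(variance)

# Hypothetical polling stations of one town: registered voters and abstention ratios
registered = [900, 1000, 800]
p_a_stations = [0.30, 0.45, 0.50]

# Town-level abstention ratio and its heterogeneity over polling stations
p_a_town, delta_p_a = weighted_mean_and_std(p_a_stations, registered)
print(p_a_town, delta_p_a)
```

The same function applies unchanged to the polling station values of S or τ3.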

Fig. S11-a plots the involvement entropy heterogeneity of a town α (i.e. $\delta[S]_\alpha$) with respect to its
involvement entropy (i.e. $S_\alpha$). One should pay attention to the fact that $\delta[S]_\alpha$ going through a minimum
as $S_\alpha \approx 1$ could just be a consequence of |dS| having a minimum near $p_{bn} \approx 0$ and $p_a \approx 0.5$. Hence the
benchmark presented here consists in comparing the empirical data with surrogate ones for which the
heterogeneity in pa, pc and pbn is the same for all municipalities, up to a binomial noise.

Here, the benchmark forces the same heterogeneity of the set of ratios {pa, pc, pbn} for every town,
but keeps their initial values for the whole town. In other words, for a town α, $\delta[p_a]_\alpha$, $\delta[p_c]_\alpha$
and $\delta[p_{bn}]_\alpha$ have the same values as in other towns; but $\{p_{a,\alpha}, p_{c,\alpha}, p_{bn,\alpha}\}$ are the real values of the
town α, measured by the election.

The benchmark is realized as follows. First, we measure for each town $\alpha$ the ratios $p_{a,\alpha}$ and
$p_{bn,\alpha}$, and also $\delta[p_a]_\alpha$ and $\delta[p_{bn}]_\alpha$. The average values of the
heterogeneities $\delta[p_a]_\alpha$ and $\delta[p_{bn}]_\alpha$ over the 100 considered towns are
respectively written as $\delta p_a$ and $\delta p_{bn}$. Secondly, we draw from a binomial distribution, for
each polling station $\alpha_i$, its number of registered voters who do not take part in the election
($N_{a,\alpha_i}$) and its number of Blank and Null votes ($N_{bn,\alpha_i}$), such that:

$$N_{a,\alpha_i} \sim B(N_{\alpha_i};\, p_{a,\alpha} + \eta_a), \qquad N_{bn,\alpha_i} \sim B(N_{\alpha_i};\, p_{bn,\alpha} + \eta_{bn}), \tag{S8}$$

where $\eta_a$ and $\eta_{bn}$ are independent Gaussian noises of zero mean and of standard deviation
$\delta p_a$ and $\delta p_{bn}$ respectively, and $N_{\alpha_i}$ is the real number of registered voters of
the polling station $\alpha_i$ of the considered town $\alpha$. Note that we use a binomial distribution in
order to take finite size effects into account; and here, for each citizen in a surrogate polling station,
the probabilities of not voting and of casting a Blank or Null vote are mutually independent.
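
The surrogate draw of Eq. (S8) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the Gaussian noise is taken with zero mean, the noisy probability is clipped to $[0, 1]$ (a choice not specified in the text), and the binomial draw is done by summing Bernoulli trials so that only the standard library is needed. All names and numerical values are hypothetical.

```python
import random

def draw_binomial(rng, n, p):
    """Number of successes in n independent Bernoulli(p) trials."""
    return sum(1 for _ in range(n) if rng.random() < p)

def clip01(x):
    """Keep a noisy probability inside [0, 1] (assumption, not in the text)."""
    return min(1.0, max(0.0, x))

def surrogate_station(rng, n_registered, p_a_town, p_bn_town, dp_a, dp_bn):
    """Draw surrogate counts (N_a, N_bn) for one polling station.

    p_a_town, p_bn_town: real town-level ratios.
    dp_a, dp_bn: heterogeneities averaged over the considered towns.
    """
    eta_a = rng.gauss(0.0, dp_a)     # independent Gaussian noises
    eta_bn = rng.gauss(0.0, dp_bn)
    n_a = draw_binomial(rng, n_registered, clip01(p_a_town + eta_a))
    # Drawn independently of n_a, as stated in the text; in extreme cases
    # n_a + n_bn could therefore exceed n_registered.
    n_bn = draw_binomial(rng, n_registered, clip01(p_bn_town + eta_bn))
    return n_a, n_bn

rng = random.Random(0)               # fixed seed for reproducibility
station_sizes = [900, 1100, 1000]    # hypothetical registered voters
surrogates = [surrogate_station(rng, n, 0.40, 0.03, 0.05, 0.01)
              for n in station_sizes]
print(surrogates)
```

The surrogate counts can then be turned into ratios and passed through the same heterogeneity measures as the empirical data.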

Instead of standard deviations, an alternative measure of heterogeneity is provided by the so-called
Kullback-Leibler divergence, which characterizes the difference between two probability distributions. For
each polling station $\alpha_i$ of a given town $\alpha$, we compute the divergence $D_{KL,\alpha_i}$ from the
polling station distribution $P_{\alpha_i}$ to the distribution of the rest of the town, $P_{\alpha-\alpha_i}$,

$$D_{KL,\alpha_i} = \sum_{j} P_{\alpha_i}(j) \log \frac{P_{\alpha_i}(j)}{P_{\alpha-\alpha_i}(j)}, \tag{S9}$$

where, here and in the following, for any distribution we write $P(j)$, $j = 1, 2, 3$, instead of $p_a$,
$p_c$, $p_{bn}$. Then we compute the mean Kullback-Leibler divergence, $D_{KL,\alpha}$, of the town $\alpha$
by averaging over all polling stations, weighting by the corresponding number of registered voters,
$N_{\alpha_i}$.

This mean Kullback-Leibler divergence $D_{KL,\alpha}$ gives us a measure of the heterogeneity of polling
stations within a town $\alpha$.
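
A possible way to compute this mean divergence is sketched below in Python. It assumes the standard form $\sum_j P(j) \log[P(j)/Q(j)]$ with the natural logarithm, and obtains the "rest of the town" distribution by subtracting the station counts from the town totals. The function names and the example counts are hypothetical.

```python
import math

def ratios(n_registered, n_a, n_bn):
    """Return (p_a, p_c, p_bn) from raw counts."""
    p_a = n_a / n_registered
    p_bn = n_bn / n_registered
    return (p_a, 1.0 - p_a - p_bn, p_bn)

def kl_divergence(p, q):
    """D_KL(P || Q) = sum_j P(j) log(P(j) / Q(j)), with 0 log 0 taken as 0.
    Assumes Q(j) > 0 wherever P(j) > 0."""
    return sum(pj * math.log(pj / qj) for pj, qj in zip(p, q) if pj > 0)

def mean_kl_of_town(stations):
    """stations: list of (registered, abstainers, blank + null), length >= 2.
    Returns the divergence averaged over stations, weighted by registered voters."""
    tot_n = sum(s[0] for s in stations)
    tot_a = sum(s[1] for s in stations)
    tot_bn = sum(s[2] for s in stations)
    weighted_sum = 0.0
    for n, n_a, n_bn in stations:
        p_station = ratios(n, n_a, n_bn)
        # Distribution of the town once this station is removed
        p_rest = ratios(tot_n - n, tot_a - n_a, tot_bn - n_bn)
        weighted_sum += n * kl_divergence(p_station, p_rest)
    return weighted_sum / tot_n

town = [(1000, 400, 30), (800, 360, 20), (1200, 420, 48)]  # hypothetical counts
print(mean_kl_of_town(town))
```

A town whose polling stations all share the same three ratios gives a mean divergence of zero, and the value grows as stations depart from the rest of the town.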

Fig. S11 compares benchmark curves and empirical data. Heterogeneity of polling stations within a same town
is measured in three different ways: via the standard deviations of the involvement entropy and of the
logarithmic three choices value, and via the Kullback-Leibler divergence. It appears that the smaller the
involvement entropy $S$ (with $S < 0.85$), the smaller the involvement entropy heterogeneity at the polling
station level (see more specifically Fig. S11-d). In other words, the more a town is "ordered" (with respect
to the civic involvement of its electorate), the more homogeneous it is (at the polling station scale, and
still from a civic involvement point of view). Note also that this point is particularly clear when the ratio
$p_c$ is high (e.g. for 3 French elections), compared to cases where $p_a$ is high (e.g. for European
Parliament elections in Romania and Poland). It can also be noted that the heterogeneity of involvement
entropies of polling stations inside towns ($\delta[S]_\alpha$) often has a markedly minimal value when
their involvement entropies ($S_\alpha$) are around 1, and that this minimum is much more marked for real
data than for benchmark ones.

F. Disentangling Blank votes from Null votes

Figure S12. Blank votes are grouped with: (1) Null votes (as in the main text, cf. Eq. (2)) in black;
(2) Valid votes, i.e. another vote included in the list of choices (cf. Eq. (S12)) in red; (3) citizens who do
not take part in the election (cf. Eq. (S13)) in green. Top: mean values of $S$, $S_{b=c}$, $S_{b=a}$ over
bins of around 100 municipalities of size $\approx N$ (as in Fig. 4). Bottom: evolution in time of $S$,
$S_{b=c}$, $S_{b=a}$ (with the same scale of aggregate data as in Fig. 5). For the sake of clarity, standard
deviations over Swiss Cantons and Spanish Comunidades autonomas are not shown. Each point (R) of the Swiss
graph gives the average of around 20 Swiss referendums. The end of Italian compulsory voting is shown by a
vertical line.

Italy, Spain and Switzerland are countries for which Blank votes are distinguished from Null votes. Let
$N_b$ and $N_n$ be the numbers of citizens who respectively vote Blank and Null amongst $N$ registered
voters (of one municipality, Canton, Comunidad autonoma, or the whole country). The ratios, or probabilities,
to respectively vote Blank and Null are

$$p_b = \frac{N_b}{N}, \qquad p_n = \frac{N_n}{N}. \tag{S11}$$

In such cases, it is legitimate to consider that Blank votes should be categorized with votes in favor of
one of the proposed choices of the election. Then the Blank vote no longer has a 'marginal' involvement
meaning, as it had previously; its citizen involvement is instead similar to that of any other Valid vote
according to the list of choices of the election. One should then consider a modified involvement entropy,
defined from the 3-set of ratios (of sum unity) $\{p_a, (p_c + p_b), p_n\}$, that is

$$S_{b=c} = -p_a \log(p_a) - (p_c + p_b) \log(p_c + p_b) - p_n \log(p_n). \tag{S12}$$

Footnote (Italy): We only analyze the first question asked in a Referendum. Senate elections are not shown
in Fig. S12 because they are very similar to Chamber of Deputies (D) elections.

Footnote (Switzerland): In our database, Chamber of Deputies elections (D) distinguish Blank votes from Null
votes since 1971, and votations (or referendums) since 1887.

Alternatively, one may consider that Blank votes lose their 'marginal' aspect in citizen involvement, and
should be categorized with citizens who do not take part in the election. Then the relevant modified
involvement entropy, defined from the 3-set of ratios (still of sum unity) $\{(p_a + p_b), p_c, p_n\}$,
writes as

$$S_{b=a} = -(p_a + p_b) \log(p_a + p_b) - p_c \log(p_c) - p_n \log(p_n). \tag{S13}$$
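
To make the three groupings concrete, the short Python sketch below evaluates the involvement entropy with Blank votes grouped with Null votes, together with the two modified versions of Eqs. (S12) and (S13), from the four ratios $p_a$, $p_c$, $p_b$, $p_n$. The natural logarithm is an assumption, as are the function names and the example ratios.

```python
import math

def entropy3(x, y, z):
    """Entropy of a 3-set of ratios summing to one; zero ratios contribute 0."""
    return -sum(p * math.log(p) for p in (x, y, z) if p > 0)

def involvement_entropies(p_a, p_c, p_b, p_n):
    """Return (S, S_bc, S_ba) for ratios with p_a + p_c + p_b + p_n = 1."""
    s = entropy3(p_a, p_c, p_b + p_n)      # Blank grouped with Null
    s_bc = entropy3(p_a, p_c + p_b, p_n)   # Eq. (S12): Blank grouped with choices
    s_ba = entropy3(p_a + p_b, p_c, p_n)   # Eq. (S13): Blank grouped with abstention
    return s, s_bc, s_ba

# Hypothetical ratios: 40% abstention, 50% choices, 6% Blank, 4% Null
s, s_bc, s_ba = involvement_entropies(0.40, 0.50, 0.06, 0.04)
print(s, s_bc, s_ba)
```

With these example ratios both modified entropies come out lower than $S$, because moving the Blank votes into a larger group makes the 3-set more unequal.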

Figure S12 shows, for Italy, Spain and Switzerland, the involvement entropy $S$ and the modified versions
$S_{b=c}$ and $S_{b=a}$: (1) for municipalities, with respect to the municipality size $N$ (as in Fig. 4);
(2) for the whole country (directly for Italy, and as a weighted mean by population size over 25 or 26
Swiss Cantons and 17 or 19 Spanish Comunidades autonomas) as a function of time (as in Fig. 5). Fig. 8
shows the modified involvement entropy $S_{b=c}$ ($S_{b=a}$, which is not shown, is very close to
$S_{b=c}$) with respect to the involvement entropy $S$, for $\approx 530$ Swiss Referendums.
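
The size-binned averages used for the municipality-scale curves can be sketched as follows. This is only an illustration under assumptions: municipalities are sorted by number of registered voters, cut into consecutive bins of about 100, and averaged without weights inside each bin; the handling of the last, possibly smaller, bin is also an assumption. The function name and the synthetic data are hypothetical.

```python
def binned_means(municipalities, bin_size=100):
    """municipalities: list of (n_registered, entropy).
    Returns one (mean size, mean entropy) pair per bin of ~bin_size
    municipalities, ordered by increasing size."""
    ordered = sorted(municipalities, key=lambda m: m[0])
    bins = []
    for start in range(0, len(ordered), bin_size):
        chunk = ordered[start:start + bin_size]   # last chunk may be smaller
        mean_n = sum(n for n, _ in chunk) / len(chunk)
        mean_s = sum(s for _, s in chunk) / len(chunk)
        bins.append((mean_n, mean_s))
    return bins

# Synthetic example: 250 municipalities of sizes 1..250, all with entropy 1.0
example = [(n, 1.0) for n in range(1, 251)]
curve = binned_means(example)
print(curve)
```

The same function can be applied to $S$, $S_{b=c}$ and $S_{b=a}$ in turn to obtain the three curves of one panel.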

Figures S12 and 8 exhibit some trends and regularities that depend on the value of the involvement
entropy $S$.

(1) When $S < 1$ (e.g. in Italian and Spanish Chamber of Deputies elections, both at municipality scale
and at the large scale of aggregate data), modified involvement entropies are smaller than $S$. This means
a greater order of the modified citizen involvement. It can be interpreted as follows: the loss of the
nuance or specificity (for citizen involvement) carried by Blank votes implies a greater polarization or
heterogeneity of the electorate, still split into 3 groups.

(2) When $S \approx 1$, two different cases arise. First, for Spanish European Parliament elections, Swiss
Chamber of Deputies elections and Swiss Referendums (for the latter, only since the 2000s), both at
municipality scale and at the large scale of aggregate data: the modified involvement entropies are slightly
lower than $S$, but still $\approx 1$. Second, for earlier Swiss Referendums, and particularly before the
1960s: $S_{b=c}$ (or $S_{b=a}$) is lower than $S$, and by more than a slight amount.

(3) When $S > 1$ and $S \not\approx 1$ (e.g. for Italian European Parliament elections, both at municipality
scale and at the large scale of aggregate data, and for Spanish Referendums, particularly the 1986 and 2005
ones, at provincial scale), modified involvement entropies are still lower than $S$. But once again, it is
surprising to notice that modified involvement entropies are such that $S_{b=c} \approx 1$ (or
$S_{b=a} \approx 1$). It can be explained as follows: the subtlety or specificity of citizen involvement due
to Blank votes means an increase in the disorder of the electorate involvement. The loss of this subtlety or
specificity (i.e. considering Blank votes as another vote in favor of the list of choices, or as another
abstention) implies a loss of the 'tension' contained in the electoral campaign. And strikingly, this loss of
'tension' provides a new entropy equal to the usual common value of involvement entropy, $\approx 1$.

Note that the above items (1) and (3) (i.e. when $S$ is significantly below or above 1), pointed out in
Fig. S12, are clearly shown in Fig. 8 for Swiss Referendums. (In Fig. 8, S" means $S_{b=c}$, which is very
close to $S_{b=a}$ on average.) Note also that the surprising plateau (which provides modified involvement
entropies equal to $\approx 1$, on average, when $S > 1.05$) does not exist, in our database, for the most
populated municipalities. For the latter case (not shown), the around 100 most populated municipalities for
which $S > 1.05$ only provide $S_{b=c}$ (or $S_{b=a}$) lower than $S$ such that, on average,
$S_{b=c} \approx 1$ (or $S_{b=a} \approx 1$), but without a plateau.

Lastly, this study does not allow us to know whether it is more meaningful (according to the entropy of
the electorate involvement) to consider a Blank vote as another vote proposed in the list of choices or as
another abstention. Nevertheless, in our database, Blank votes seem more meaningful than Null votes in Spain
and in Switzerland. Indeed, when $p_n$ and $p_b$ are interchanged in Eqs. (S12) or (S13), the above item (3)
(when $S$ is significantly above 1, the modified involvement entropy is $\approx 1$) clearly does not exist
for Spanish and Swiss Referendums.

To conclude, let us recall the main point of this section: when the involvement entropy $S$ does not
conform to the common value $S \approx 1$ for high population-size municipalities, or at large aggregate
scale, because the citizen involvement of the electorate is too disordered (i.e. $S$ significantly above 1),
then the modified involvement entropy (obtained by dropping the specificity of Blank votes) takes on average
the same common value $\approx 1$.